Hive error: Error while compiling statement: FAILED: SemanticException [Error 10226]: An addi...
WITH tmp as(
select
a.*,
if(b.guid is not null,null,a.guid) as nu -- new-user marker: NULL for existing users, the user's own guid for new users
from
(
SELECT * FROM dwd_apl_utm_dtl WHERE dt='2020-03-15'
) a
left join
(
SELECT * FROM dws_apl_hsu_rec WHERE dt='2020-03-15'
) b
on a.guid=b.guid
)
INSERT INTO TABLE ads_apl_utm_ovw PARTITION(dt='2020-03-15')
SELECT
event['utm_source'],
event['utm_medium'],
event['utm_content'],
event['utm_campain'],
event['utm_term'],
count(1) as pv_cnts,
count(distinct sessionid) as se_cnts,
count(distinct guid) as uv_cnts,
count(distinct nu) as nu_cnts
FROM tmp
group by event['utm_source'],event['utm_medium'],event['utm_content'],event['utm_campain'],event['utm_term']
with cube;
Error: Error while compiling statement: FAILED: SemanticException [Error 10226]: An additional MR job is introduced since the cardinality of grouping sets is more than hive.new.job.grouping.set.cardinality. This functionality is not supported with distincts. Either set hive.new.job.grouping.set.cardinality to a high number (higher than the number of rows per input row due to grouping sets in the query), or rewrite the query to not use distincts. The number of rows per input row due to grouping sets is 32 (state=42000,code=10226)
Cause: the query uses with cube together with count(distinct ...).
Solution: run set hive.new.job.grouping.set.cardinality = 32; before the query.
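A minimal sketch of how the setting might be applied, assuming it is issued in the same Hive session (for example in beeline) right before the failing statement. The value 32 is the one reported in the error message; the original query itself is not repeated here.

```sql
-- Raise the threshold for the current session only
-- (assumption: a session-level set is sufficient, hive-site.xml is left untouched).
set hive.new.job.grouping.set.cardinality = 32;

-- Optional: print the current value to confirm the change took effect.
set hive.new.job.grouping.set.cardinality;

-- Then re-run the original WITH tmp AS (...) INSERT INTO ... WITH CUBE statement unchanged.
```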
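The value must be at least the number of grouping sets produced by the group by. Here five keys are cubed, which gives 2^5 = 32 grouping sets, so the setting must be greater than or equal to 32 (the default is 30, which is why the query fails without the change).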
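To see where the count comes from, here is a small illustration on a hypothetical table t with columns a and b (neither appears in the original query). With two keys, with cube expands to 2^2 = 4 grouping sets; each extra key doubles the count, which gives 32 for the five keys used above.

```sql
-- Hypothetical table t(a, b), used only to illustrate how CUBE expands.
SELECT a, b, count(1) AS cnt
FROM t
GROUP BY a, b WITH CUBE;

-- Equivalent explicit form: 2^2 = 4 grouping sets.
SELECT a, b, count(1) AS cnt
FROM t
GROUP BY a, b
GROUPING SETS ((a, b), (a), (b), ());
```

Each input row is emitted once per grouping set, which is what the error message means by "the number of rows per input row due to grouping sets".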