Working around unsupported correlated WHERE subqueries in Hive

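I am trying to work around the fact that Hive does not support correlated subqueries. I have been counting how many items exist in each week's data over the last month, and now I want to know how many items dropped off, came back, or are completely new this week. This would not be too hard if I could use a subquery in the WHERE clause, but without one I am having a hard time coming up with a workaround.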

    Select
    count(distinct item)
    From data
    where item in (Select item from data where date <= ("2016-05-10"))
    and date between "2016-05-01" and getdate()

Any help would be great. Thanks.

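The workaround is to left join the two result sets and keep only the rows where the column from the second result set is null. Note that this gives the items that are in the first set but not in the second, that is, the equivalent of NOT IN.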

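      -- Items seen since 2016-05-01 that never appear on or before 2016-05-10
      Select count(a.item)
            from
                (select distinct item from data where date between "2016-05-01" and getdate()) a
            left join (select distinct item from data where date <= "2016-05-10") b
            on a.item = b.item
            -- The null check must be in WHERE; inside ON it would block every match
            -- and the query would count all rows of a
            where b.item is null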
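
The same join-based idea can be extended to the other cases in the question. What follows is only a rough sketch, not a tested solution: it reuses the table and column names from the question (`data`, `item`, `date`), treats `2016-05-10` as the cutoff between the earlier period and the current one, and assumes there are no future-dated rows, so the upper bound `getdate()` is left out. The aliases `prev` and `curr` are made up for illustration. The first query uses `LEFT SEMI JOIN`, which is Hive's join form of an `IN` subquery; the second reverses the anti-join from the answer to find items that dropped off.

```sql
-- Equivalent of the original IN query: items seen since 2016-05-01
-- that also appear on or before 2016-05-10.
-- In a LEFT SEMI JOIN the right-hand side may only be referenced in ON.
SELECT count(DISTINCT a.item)
FROM data a
LEFT SEMI JOIN (
    SELECT DISTINCT item FROM data WHERE date <= "2016-05-10"
) b
  ON a.item = b.item
WHERE a.date >= "2016-05-01";

-- Dropped items (assumed definition): seen on or before the cutoff,
-- with no rows after it.
SELECT count(prev.item)
FROM (SELECT DISTINCT item FROM data WHERE date <= "2016-05-10") prev
LEFT JOIN (SELECT DISTINCT item FROM data WHERE date > "2016-05-10") curr
  ON prev.item = curr.item
WHERE curr.item IS NULL;
```

Counting returned items would need a third period (seen earlier, absent in the previous week, present again this week), which could be built by chaining one semi join and one anti-join in the same way.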