Aggregate Sliding Window for Hive

I have a Hive table that is sorted by a numeric value, for example a count.

fruit   count
------  -------
apple   10
orange  8
banana  5
melon   3
pears   1

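The total is 27, and I need to split it into three segments. The first third of the count, 1 to 9, is zone one; 10 to 18 is zone two; and 19 to 27 is zone three. I think I need some kind of sliding window.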

fruit   count    zone
------  ------- --------
apple   10      one
orange  8       two
banana  5       three
melon   3       three
pears   1       three

Any idea how to solve this?

The SQL way:

select *,
(
sum(count) over (partition by 1 order by count desc) /* running total */
div (sum(count) over (partition by 1) div 3) /* zone size: total count split into 3 groups, which is 9 here; div is Hive's integer division, plain / returns a double */
) /* integer quotient of the running total by the zone size */
+ /* e.g. 11 / 9 = 1 remainder 2, so 1 must be added to the quotient to put 11 in the right zone */
(
case when
(
sum(count) over (partition by 1 order by count desc)
% (sum(count) over (partition by 1) div 3) /* remainder of the running total by the zone size */
) > 1 then 1 else 0 end /* if the remainder is greater than 1, the row moves up one zone */
) as zone
from yourtable
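
To check the arithmetic without a Hive session, here is a minimal Python sketch of the same quotient-plus-remainder rule applied to the sample rows from the question. The names `rows` and `assign_zones` are made up for illustration, and it assumes integer division throughout, as the query above intends. The "remainder > 1" threshold is taken from the query as-is; it gives the expected zones for this sample, but it is worth re-checking on other data.

```python
# Sample rows from the question, already sorted by count descending.
rows = [("apple", 10), ("orange", 8), ("banana", 5), ("melon", 3), ("pears", 1)]


def assign_zones(rows, n_zones=3):
    """Hypothetical helper mirroring the query: running total -> zone number."""
    total = sum(count for _, count in rows)
    zone_size = total // n_zones  # 27 // 3 = 9 for the sample data
    running = 0
    result = []
    for fruit, count in rows:
        running += count  # running total, like sum(count) over (order by count desc)
        quotient, remainder = divmod(running, zone_size)
        # Same rule as the case expression: bump the zone when remainder > 1.
        zone = quotient + (1 if remainder > 1 else 0)
        result.append((fruit, count, running, zone))
    return result


for row in assign_zones(rows):
    print(row)
```

For the sample data the running totals are 10, 18, 23, 26, 27, and the zones come out as 1, 2, 3, 3, 3, which matches the desired one / two / three / three / three.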