Partition by with condition statement
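I have data on products sold in various stores. In some stores the product is sold at a discount, which is flagged by PROMO_FLG.
I would like to display two COUNT ... OVER (PARTITION BY ...) columns.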
+-------+------+-----------+
| Store | Item | PROMO_FLG |
|-------+------+-----------|
|     1 |    1 |         0 |
|     2 |    1 |         1 |
|     3 |    1 |         0 |
|     4 |    1 |         0 |
|     5 |    1 |         1 |
|     6 |    1 |         1 |
|     7 |    1 |         1 |
|     8 |    1 |         0 |
|     9 |    1 |         0 |
|    10 |    1 |         0 |
+-------+------+-----------+
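The first column shows all stores that carry the item (done):
COUNT(DISTINCT STORE) OVER (PARTITION BY ITEM)
which gives 10.
The second column, the one I am looking for, should count only the stores where PROMO_FLG = 1.
That should give us a value of 4.
I think you want: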
select t.*,
count(*) over (partition by item) as num_stores,
sum(promo_flg) over (partition by item) as num_promo_1
from t;
If you actually need distinct counts:
select t.*,
count(distinct store) over (partition by item) as num_stores,
count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from t;
Here is a db<>fiddle. The fiddle uses Oracle because it supports COUNT(DISTINCT) as a window function.
If the window functions don't work, here is an alternative:
select *
from t join
     (select item,
             count(distinct store) as num_stores,
             count(distinct case when promo_flg = 1 then store end) as num_stores_promo
      from t
      group by item
     ) tt
     using (item);
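The join-based alternative can be checked without access to a warehouse. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in engine (SQLite is not used anywhere in the answers; the table name t comes from the query). It loads the ten sample rows and joins the GROUP BY subquery back to the detail rows, writing the join with ON instead of USING, which is equivalent here.

```python
import sqlite3

# Sample rows from the question: (store, item, promo_flg)
rows = [(1, 1, 0), (2, 1, 1), (3, 1, 0), (4, 1, 0), (5, 1, 1),
        (6, 1, 1), (7, 1, 1), (8, 1, 0), (9, 1, 0), (10, 1, 0)]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (store integer, item integer, promo_flg integer)")
conn.executemany("insert into t values (?, ?, ?)", rows)

# Aggregate once per item, then attach the counts to every detail row.
query = """
select t.store, t.item, t.promo_flg, tt.num_stores, tt.num_stores_promo
from t join
     (select item,
             count(distinct store) as num_stores,
             count(distinct case when promo_flg = 1 then store end) as num_stores_promo
      from t
      group by item
     ) tt
     on tt.item = t.item
order by t.store
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Every one of the ten rows carries num_stores = 10 and num_stores_promo = 4, matching the values the question asks for.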
A second answer, using Gordon's SQL but showing that it works in Snowflake:
select v.*
,count(distinct store) over (partition by item) as num_stores
,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
,sum(iff(promo_flg = 1, 1, 0)) over (partition by item) as num_sum_promo_stores
from values
(1 , 1, 0 ),
(2 , 1, 1 ),
(3 , 1, 0 ),
(4 , 1, 0 ),
(5 , 1, 1 ),
(6 , 1, 1 ),
(7 , 1, 1 ),
(8 , 1, 0 ),
(9 , 1, 0 ),
(10, 1, 0 )
v(store, item, promo_flg) ;
This gives:
STORE ITEM PROMO_FLG NUM_STORES NUM_DIS_PROMO_STORES NUM_SUM_PROMO_STORES
1 1 0 10 4 4
2 1 1 10 4 4
3 1 0 10 4 4
4 1 0 10 4 4
5 1 1 10 4 4
6 1 1 10 4 4
7 1 1 10 4 4
8 1 0 10 4 4
9 1 0 10 4 4
10 1 0 10 4 4
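So it depends on whether you want a distinct count or a sum. I used IFF, a non-standard SQL function that Snowflake supports, because I prefer the shorter SQL.
But you can see that both work.
Gordon's second case, count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1, also tests fine.
In response to Marcin2x4's question on Gordon's answer: the methods give different results if/when the data differs from how you described it, for example if a store has multiple rows for the same item with promo_flg set, or if promo_flg holds a non-zero value other than 1: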
select v.*
,count(distinct store) over (partition by item) as num_stores
,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
,sum(iff(promo_flg <> 0, 1, 0)) over (partition by item) as num_sum_promo_stores
,sum(promo_flg) over (partition by item) as num_promo_1
,count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from values
(1 , 1, 0 ),
(2 , 1, 1 ),
(3 , 1, 0 ),
(4 , 1, 0 ),
(5 , 1, 1 ),
(6 , 1, 1 ),
(7 , 1, 1 ),
(8 , 1, 0 ),
(9 , 1, 0 ),
(10, 1, 0 ),
(7, 1, 1 ),
(7, 1, 2 )
v(store, item, promo_flg) ;
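Then num_dis_promo_stores and the second num_promo_1 (the COUNT(DISTINCT CASE ...) column) give 4, num_sum_promo_stores gives 6, and the first num_promo_1 (the SUM(promo_flg) column) gives 7.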
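The three different results on the twelve-row data set can be reproduced with plain aggregates. This sketch again assumes Python's sqlite3 as a stand-in engine (not the Snowflake used in the answer), replaces IFF with a standard CASE expression, and uses GROUP BY instead of window functions, so it returns one summary row per item. The column names distinct_promo_stores, nonzero_promo_rows and promo_flg_total are illustrative, chosen to avoid the duplicated num_promo_1 alias.

```python
import sqlite3

# The ten original rows plus two extra rows for store 7: (store, item, promo_flg)
rows = [(1, 1, 0), (2, 1, 1), (3, 1, 0), (4, 1, 0), (5, 1, 1), (6, 1, 1),
        (7, 1, 1), (8, 1, 0), (9, 1, 0), (10, 1, 0),
        (7, 1, 1), (7, 1, 2)]

conn = sqlite3.connect(":memory:")
conn.execute("create table v (store integer, item integer, promo_flg integer)")
conn.executemany("insert into v values (?, ?, ?)", rows)

query = """
select item,
       count(distinct store) as num_stores,
       count(distinct case when promo_flg = 1 then store end) as distinct_promo_stores,
       sum(case when promo_flg <> 0 then 1 else 0 end) as nonzero_promo_rows,
       sum(promo_flg) as promo_flg_total
from v
group by item
"""
summary = conn.execute(query).fetchone()
print(summary)  # (1, 10, 4, 6, 7)
```

Only the distinct count stays at 4: the row-based sum counts store 7 three times, and summing the flag itself also adds the value 2.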