How to pivot multiple aggregations in Snowflake
My table is structured as follows:
product_id | Period | Sales | Profit |
---|---|---|---|
x1 | L13 | 0 | |
x1 | L26 | 0 | |
x1 | L52 | 0 | |
x2 | L13 | 0 | 0 |
x2 | L26 | 0 | 0 |
x2 | L52 | 0 | 0 |
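I want to pivot the Period column and show Sales and Profit in the resulting columns. I need a table like the one below.
product_id | SALES_L13 | SALES_L26 | SALES_L52 | PROFIT_L13 | PROFIT_L26 | PROFIT_L52 |
---|---|---|---|---|---|---|
x1 | 0 | 0 | 0 | | | |
x2 | 0 | 0 | 0 | 0 | 0 | 0 |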
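I am writing the query in Snowflake. I tried Snowflake's pivot function, but it only lets me specify one aggregate function.
Can anyone help me achieve this?
Any help is appreciated.
Thanks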
Use conditional aggregation:
SELECT product_id
,SUM(CASE WHEN Period = 'L13' THEN Sales END) AS SALES_L13
,SUM(CASE WHEN Period = 'L26' THEN Sales END) AS SALES_L26
,SUM(CASE WHEN Period = 'L52' THEN Sales END) AS SALES_L52
,SUM(CASE WHEN Period = 'L13' THEN Profit END) AS PROFIT_L13
,SUM(CASE WHEN Period = 'L26' THEN Profit END) AS PROFIT_L26
,SUM(CASE WHEN Period = 'L52' THEN Profit END) AS PROFIT_L52
FROM tab
GROUP BY product_id
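To try the conditional aggregation pattern without a Snowflake account, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for Snowflake (an assumption for illustration), and the sample figures and the helper name `pivot_sales_profit` are made up; the `SUM(CASE WHEN ...)` query has the same shape as the one above.

```python
import sqlite3

# Hypothetical sample data: (product_id, period, sales, profit).
SAMPLE_ROWS = [
    ("x1", "L13", 100, 10),
    ("x1", "L26", 200, 20),
    ("x1", "L52", 300, 30),
    ("x2", "L13", 500, 110),
    ("x2", "L26", 600, 120),
    ("x2", "L52", 700, 130),
]

# One SUM(CASE ...) per output column: the CASE keeps only the rows
# for one period, and SUM collapses them to a single value per product.
PIVOT_QUERY = """
SELECT product_id
      ,SUM(CASE WHEN period = 'L13' THEN sales  END) AS sales_l13
      ,SUM(CASE WHEN period = 'L26' THEN sales  END) AS sales_l26
      ,SUM(CASE WHEN period = 'L52' THEN sales  END) AS sales_l52
      ,SUM(CASE WHEN period = 'L13' THEN profit END) AS profit_l13
      ,SUM(CASE WHEN period = 'L26' THEN profit END) AS profit_l26
      ,SUM(CASE WHEN period = 'L52' THEN profit END) AS profit_l52
FROM tab
GROUP BY product_id
ORDER BY product_id
"""


def pivot_sales_profit(rows):
    """Load rows into an in-memory table and return the pivoted result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tab (product_id TEXT, period TEXT, "
            "sales INTEGER, profit INTEGER)"
        )
        conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?)", rows)
        return conn.execute(PIVOT_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pivot_sales_profit(SAMPLE_ROWS):
        print(row)
```

A missing Profit value, as in the x1 rows of the question, simply comes back as NULL (None in Python) in the corresponding PROFIT column.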
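I believe you can only have one pivot at a time, but you can check by running the first query below. Then run the second query, which uses a single pivot, to confirm that at least that works. If multiple pivots are not allowed (the first query), you can fall back on the third query, which uses case when, or first combine the columns with a union (Phil Culson's approach).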
select *
from [table name]
pivot(sum(sales) for PERIOD in ('L13', 'L26', 'L52'))
pivot(sum(profit) for PERIOD in ('L13', 'L26', 'L52')) -- likely fails: PERIOD is consumed by the first pivot
order by product_id;
If the above does not work, try a single pivot as described here: https://count.co/sql-resources/snowflake/pivot-tables
select *
from [table name]
pivot(sum(sales) for PERIOD in ('L13', 'L26', 'L52'))
order by product_id;
Otherwise you will have to apply manual case when logic:
select
product_id,
sum(case when Period = 'L13' then Sales end) as sales_l13,
sum(case when Period = 'L26' then Sales end) as sales_l26,
sum(case when Period = 'L52' then Sales end) as sales_l52,
sum(case when Period = 'L13' then Profit end) as profit_l13,
sum(case when Period = 'L26' then Profit end) as profit_l26,
sum(case when Period = 'L52' then Profit end) as profit_l52
from [table name]
group by 1
How about stacking sales and profit before pivoting? I'll leave it to you to fix the column names I messed up.
with cte (product_id, period, amount) as
(select product_id, period||'_profit', profit from t
union all
select product_id, period||'_sales', sales from t)
select *
from cte
pivot(max(amount) for period in ('L13_sales','L26_sales','L52_sales','L13_profit','L26_profit','L52_profit'))
as p (product_id,L13_sales,L26_sales,L52_sales,L13_profit,L26_profit,L52_profit);
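The stack-then-pivot idea can be easier to follow as plain data reshaping. The sketch below is a pure-Python illustration, not Snowflake code: the function name `stack_then_pivot` and the sample rows are hypothetical, and the labels are built as `SALES_<period>` / `PROFIT_<period>` so they match the column names asked for in the question.

```python
# Hypothetical sample data: (product_id, period, sales, profit).
SAMPLE_ROWS = [
    ("x1", "L13", 100, 10),
    ("x1", "L26", 200, 20),
    ("x1", "L52", 300, 30),
    ("x2", "L13", 500, 110),
    ("x2", "L26", 600, 120),
    ("x2", "L52", 700, 130),
]


def stack_then_pivot(rows):
    """Stack sales and profit into one long list, then pivot on the label."""
    # Step 1: the UNION ALL step. Each input row becomes two rows that
    # share a single "amount" column, with the metric folded into the label.
    stacked = []
    for product_id, period, sales, profit in rows:
        stacked.append((product_id, f"SALES_{period}", sales))
        stacked.append((product_id, f"PROFIT_{period}", profit))

    # Step 2: the pivot step. One output row per product, one column per
    # label, using max as the aggregate and skipping NULLs as SQL does.
    wide = {}
    for product_id, label, amount in stacked:
        if amount is None:
            continue
        columns = wide.setdefault(product_id, {})
        columns[label] = amount if label not in columns else max(columns[label], amount)
    return wide


if __name__ == "__main__":
    for product_id, columns in stack_then_pivot(SAMPLE_ROWS).items():
        print(product_id, columns)
```

Because both metrics now live in one column, a single aggregate is enough, which is what works around the one-aggregate limit of pivot.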
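If you want to pivot period twice, once for sales and once for profit, you need to duplicate the column so that each pivot has its own copy. Because the duplicated column still exists after the first pivot, this produces nulls. To fix that, we can use max in the final select. Here is what the implementation looks like: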
select product_id,
max(L13_sales) as L13_sales,
max(L26_sales) as L26_sales,
max(L52_sales) as L52_sales,
max(L13_profit) as L13_profit,
max(L26_profit) as L26_profit,
max(L52_profit) as L52_profit
from (select *, period as period2 from t) t
pivot(max(sales) for period in ('L13','L26','L52'))
pivot(max(profit) for period2 in ('L13','L26','L52'))
as p (product_id, L13_sales,L26_sales,L52_sales,L13_profit,L26_profit,L52_profit)
group by product_id;
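At this point the query is getting hard on the eyes. You might as well use conditional aggregation or, better yet, handle the pivoting in the reporting application. A more compact alternative to conditional aggregation uses decode: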
select product_id,
max(decode(period,'L13',sales)) as L13_sales,
max(decode(period,'L26',sales)) as L26_sales,
max(decode(period,'L52',sales)) as L52_sales,
max(decode(period,'L13',profit)) as L13_profit,
max(decode(period,'L26',profit)) as L26_profit,
max(decode(period,'L52',profit)) as L52_profit
from t
group by product_id;
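I'm not 100% happy with this answer... pretty sure someone can improve on this approach.
Basically this is pivoting an ARRAY... the list of aggregate functions available for an ARRAY is not huge... there's just one, ARRAY_AGG. And PIVOT is supposed to support only AVG, COUNT, MAX, MIN, and SUM. So this shouldn't work... yet it does, which makes me think PIVOT just needs some kind of aggregation.
I recommend aggregating your metrics before building the ARRAY... but it does let you pivot multiple metrics at once, which from reading Stack Overflow should not be possible!
Copy | paste | run ... and please improve :-)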
WITH CTE AS( SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100 SALES,10 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200 SALES,20 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300 SALES,30 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500 SALES,110 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600 SALES,120 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700 SALES,130 PROFIT)
SELECT
PRODUCT_ID
,"'L13'"[0][0] SALES_L13
,"'L13'"[0][1] PROFIT_L13
,"'L26'"[0][0] SALES_L26
,"'L26'"[0][1] PROFIT_L26
,"'L52'"[0][0] SALES_L52
,"'L52'"[0][1] PROFIT_L52
FROM
(SELECT * FROM
(
SELECT PRODUCT_ID, PERIOD,ARRAY_CONSTRUCT(SALES,PROFIT) S FROM CTE)
PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')
)
)
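The `[0][0]` and `[0][1]` indexing in the query above follows from the shape the pivot produces: each row carries a `[sales, profit]` array, and ARRAY_AGG wraps those in an outer array per period. Here is a plain-Python sketch of that shape, with lists standing in for Snowflake ARRAYs; the helper name `pivot_array_agg` is hypothetical and the extra X2/L52 row is added only to show what happens with duplicates.

```python
# Hypothetical sample data: (product_id, period, sales, profit).
ROWS = [
    ("X1", "L13", 100, 10),
    ("X1", "L26", 200, 20),
    ("X1", "L52", 300, 30),
    ("X2", "L13", 500, 110),
    ("X2", "L26", 600, 120),
    ("X2", "L52", 700, 130),
    ("X2", "L52", 1700, 1130),  # second row for the same product and period
]


def pivot_array_agg(rows, periods=("L13", "L26", "L52")):
    """Mimic PIVOT(ARRAY_AGG(S) FOR PERIOD IN (...)) with nested lists."""
    wide = {}
    for product_id, period, sales, profit in rows:
        if period not in periods:
            continue
        # One cell per period; each cell is an outer list (the ARRAY_AGG
        # result) holding one inner [sales, profit] list per input row.
        cells = wide.setdefault(product_id, {p: [] for p in periods})
        cells[period].append([sales, profit])
    return wide


if __name__ == "__main__":
    wide = pivot_array_agg(ROWS)
    print(wide["X1"]["L13"])  # outer list with a single inner pair
    print(wide["X2"]["L52"])  # outer list with two inner pairs
```

With two rows for the same product and period the cell holds two pairs, so `[0]` picks only one of them and the other is silently ignored (and in Snowflake the order inside ARRAY_AGG may not be deterministic without an explicit ordering). That is the reason for aggregating the metrics before building the array.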
Aggregation example (adds 1700, 1130 to L52 for X2)
WITH CTE AS(
SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100 SALES,10 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200 SALES,20 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300 SALES,30 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500 SALES,110 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600 SALES,120 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700 SALES,130 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)
SELECT
PRODUCT_ID
,"'L13'"[0][0] SALES_L13
,"'L13'"[0][1] PROFIT_L13
,"'L26'"[0][0] SALES_L26
,"'L26'"[0][1] PROFIT_L26
,"'L52'"[0][0] SALES_L52
,"'L52'"[0][1] PROFIT_L52
FROM
(SELECT * FROM
(
SELECT PRODUCT_ID, PERIOD,ARRAY_CONSTRUCT(SUM(SALES),SUM(PROFIT)) S FROM CTE GROUP BY 1,2)
PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')
)
)
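Here is an alternative form using OBJECT_AGG and LATERAL FLATTEN, which avoids the potential support issue with PIVOT and ARRAY_AGG raised by Adrian White.
This should work for any aggregation over the multiple input columns included in the initial ARRAY_CONSTRUCT in the OBJ_TALL CTE. I expect the conditional aggregation option with CASE statements to be faster, but you would need to test at scale to see.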
-- OBJECT FORM USING LATERAL FLATTEN
WITH CTE AS(
SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100 SALES,10 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200 SALES,20 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300 SALES,30 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500 SALES,110 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600 SALES,120 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700 SALES,130 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)
,OBJ_TALL AS ( SELECT PRODUCT_ID,
OBJECT_CONSTRUCT(PERIOD,
ARRAY_CONSTRUCT( SUM(SALES)
,SUM(PROFIT)
)
) S
FROM CTE
GROUP BY PRODUCT_ID, PERIOD)
-- SELECT * FROM OBJ_TALL;
,OBJ_WIDE AS ( SELECT PRODUCT_ID, OBJECT_AGG(KEY,VALUE) OA
FROM OBJ_TALL, LATERAL FLATTEN(INPUT => S)
GROUP BY PRODUCT_ID)
-- SELECT * FROM OBJ_WIDE;
SELECT
PRODUCT_ID
,OA:L13[0] SALES_L13
,OA:L13[1] PROFIT_L13
,OA:L26[0] SALES_L26
,OA:L26[1] PROFIT_L26
,OA:L52[0] SALES_L52
,OA:L52[1] PROFIT_L52
FROM OBJ_WIDE
ORDER BY 1;
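For readers who find the OBJECT form hard to picture, the two CTEs can be sketched in plain Python, with dicts standing in for Snowflake OBJECTs and lists for ARRAYs. This is an illustration of the data shapes only, not Snowflake code; the function name `object_agg_pivot` is hypothetical and the rows are the sample data from the query above.

```python
from collections import defaultdict

# Sample data from the query above: (product_id, period, sales, profit).
ROWS = [
    ("X1", "L13", 100, 10),
    ("X1", "L26", 200, 20),
    ("X1", "L52", 300, 30),
    ("X2", "L13", 500, 110),
    ("X2", "L26", 600, 120),
    ("X2", "L52", 700, 130),
    ("X2", "L52", 1700, 1130),
]


def object_agg_pivot(rows):
    """Return {product_id: {period: [sum_sales, sum_profit]}}."""
    # OBJ_TALL: GROUP BY product_id, period and sum both metrics,
    # giving one [sales, profit] pair per (product, period).
    totals = defaultdict(lambda: [0, 0])
    for product_id, period, sales, profit in rows:
        totals[(product_id, period)][0] += sales
        totals[(product_id, period)][1] += profit

    # OBJ_WIDE: merge the per-period pairs into one object per product,
    # keyed by period (the OBJECT_AGG(KEY, VALUE) step).
    wide = {}
    for (product_id, period), pair in totals.items():
        wide.setdefault(product_id, {})[period] = pair
    return wide


if __name__ == "__main__":
    for product_id, obj in sorted(object_agg_pivot(ROWS).items()):
        # obj["L13"][0] corresponds to OA:L13[0] in the final SELECT.
        print(product_id, obj)
```

Since each period key maps directly to one pair, the final SELECT needs only a single index (`OA:L13[0]`) rather than the double index used in the ARRAY_AGG version.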
For ease of comparison with the above, here is Adrian's ARRAY_AGG and PIVOT version reformatted using CTEs.
-- ARRAY FORM - RE-WRITTEN WITH CTES FOR CLARITY AND COMPARISON TO OBJECT FORM
WITH CTE AS(
SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100 SALES,10 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200 SALES,20 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300 SALES,30 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500 SALES,110 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600 SALES,120 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700 SALES,130 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)
,ARR_TALL AS (SELECT PRODUCT_ID,
PERIOD,
ARRAY_CONSTRUCT( SUM(SALES)
,SUM(PROFIT)
) S
FROM CTE GROUP BY 1,2)
,ARR_WIDE AS (SELECT *
FROM ARR_TALL PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52') ) )
SELECT
PRODUCT_ID
,"'L13'"[0][0] SALES_L13
,"'L13'"[0][1] PROFIT_L13
,"'L26'"[0][0] SALES_L26
,"'L26'"[0][1] PROFIT_L26
,"'L52'"[0][0] SALES_L52
,"'L52'"[0][1] PROFIT_L52
FROM ARR_WIDE
ORDER BY 1;