How to pivot multiple aggregations in Snowflake

My table is structured as follows:

product_id Period Sales Profit
x1 L13 0
x1 L26 0
x1 L52 0
x2 L13 0 0
x2 L26 0 0
x2 L52 0 0

I want to pivot the Period column and show Sales and Profit for each period in the resulting columns. I need a table like the one below.

product_id SALES_L13 SALES_L26 SALES_L52 PROFIT_L13 PROFIT_L26 PROFIT_L52
x1 0 0 0
x2 0 0 0 0 0 0

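I am writing the query in Snowflake. I tried Snowflake's PIVOT function, but it only lets me specify one aggregate function.

Can anyone help me achieve this?

Any help is appreciated.

Thanks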

Use conditional aggregation:

SELECT product_id
   ,SUM(CASE WHEN Period = 'L13' THEN Sales END) AS SALES_L13
   ,SUM(CASE WHEN Period = 'L26' THEN Sales END) AS SALES_L26
   ,SUM(CASE WHEN Period = 'L52' THEN Sales END) AS SALES_L52
   ,SUM(CASE WHEN Period = 'L13' THEN Profit END) AS PROFIT_L13
   ,SUM(CASE WHEN Period = 'L26' THEN Profit END) AS PROFIT_L26
   ,SUM(CASE WHEN Period = 'L52' THEN Profit END) AS PROFIT_L52
FROM tab
GROUP BY product_id
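
To try the conditional aggregation without a Snowflake account, here is a minimal sketch that runs the same SUM(CASE ...) query against an in-memory SQLite database from Python. The sample figures are hypothetical (borrowed from the test rows used later in this thread), and SQLite is only a stand-in: it is assumed to treat this particular query the same way Snowflake does.

```python
import sqlite3

# Hypothetical sample data, not the asker's real table.
SAMPLE_ROWS = [
    ("x1", "L13", 100, 10),
    ("x1", "L26", 200, 20),
    ("x1", "L52", 300, 30),
    ("x2", "L13", 500, 110),
    ("x2", "L26", 600, 120),
    ("x2", "L52", 700, 130),
]

# One SUM(CASE ...) per (measure, period) pair.
QUERY = """
SELECT product_id
   ,SUM(CASE WHEN Period = 'L13' THEN Sales END)  AS SALES_L13
   ,SUM(CASE WHEN Period = 'L26' THEN Sales END)  AS SALES_L26
   ,SUM(CASE WHEN Period = 'L52' THEN Sales END)  AS SALES_L52
   ,SUM(CASE WHEN Period = 'L13' THEN Profit END) AS PROFIT_L13
   ,SUM(CASE WHEN Period = 'L26' THEN Profit END) AS PROFIT_L26
   ,SUM(CASE WHEN Period = 'L52' THEN Profit END) AS PROFIT_L52
FROM tab
GROUP BY product_id
ORDER BY product_id
"""


def pivot_sales_and_profit(rows):
    """Load rows into an in-memory table and return (column names, result rows)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tab (product_id TEXT, Period TEXT, Sales INTEGER, Profit INTEGER)"
        )
        conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?)", rows)
        cur = conn.execute(QUERY)
        columns = [col[0] for col in cur.description]
        return columns, cur.fetchall()
    finally:
        conn.close()


columns, result = pivot_sales_and_profit(SAMPLE_ROWS)
print(columns)
for row in result:
    print(row)
```

A period with no rows, or with only NULL values, comes back as NULL rather than 0, because the CASE expression has no ELSE branch. Wrap each SUM in COALESCE(..., 0) if zeros are preferred.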

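I believe you can only have one pivot at a time, but you can check by running the first query below. Then run it with a single pivot (the second query) to see whether that works. If multiple pivots are not allowed, you can fall back on the third query, which uses CASE WHEN, or first combine the measures with a union (the approach in Phil Culson's answer).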

 select *
      from [table name]
        pivot(sum(sales) for PERIOD in ('L13', 'L26', 'L52')),
        pivot(sum(profit) for PERIOD in ('L13', 'L26', 'L52'))
      order by product_id;

If the above does not work, try a single pivot as described here: https://count.co/sql-resources/snowflake/pivot-tables

  select *
      from [table name]
        pivot(sum(sales) for PERIOD in ('L13', 'L26', 'L52'))
      order by product_id;

Otherwise you will have to apply the CASE WHEN logic by hand:

select 
product_id,
sum(case when Period = 'L13' then Sales end)  as sales_l13,
sum(case when Period = 'L26' then Sales end)  as  sales_l26,
sum(case when Period = 'L52' then Sales end)  as  sales_l52,
sum(case when Period = 'L13' then Profit end) as  profit_l13,
sum(case when Period = 'L26' then Profit end) as  profit_l26,
sum(case when Period = 'L52' then Profit end) as  profit_l52
from [table name]
group by 1 

How about stacking sales and profit before pivoting? I'll leave it to you to fix the column names I messed up.

with cte (product_id, period, amount) as
  
(select product_id, period||'_profit', profit from t
 union all
 select product_id, period||'_sales', sales from t)
   
select * 
from cte
     pivot(max(amount) for period in ('L13_sales','L26_sales','L52_sales','L13_profit','L26_profit','L52_profit'))
     as p (product_id,L13_sales,L26_sales,L52_sales,L13_profit,L26_profit,L52_profit);
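
The stacking idea can be checked locally with a rough sketch like the one below. It reuses the UNION ALL from the CTE above, but SQLite has no PIVOT clause, so the pivot step is emulated with MAX(CASE ...), which is an assumption of this sketch rather than what Snowflake runs. The sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data for table t.
ROWS = [
    ("x1", "L13", 100, 10),
    ("x1", "L26", 200, 20),
    ("x1", "L52", 300, 30),
    ("x2", "L13", 500, 110),
    ("x2", "L26", 600, 120),
    ("x2", "L52", 700, 130),
]

# Stacking step: one row per (product, period, measure).
STACK_SQL = """
SELECT product_id, period || '_profit' AS period, profit AS amount FROM t
UNION ALL
SELECT product_id, period || '_sales', sales FROM t
"""

KEYS = ["L13_sales", "L26_sales", "L52_sales", "L13_profit", "L26_profit", "L52_profit"]

# Pivot step (emulated): one MAX(CASE ...) per stacked key.
PIVOT_COLUMNS = ",\n    ".join(
    f"MAX(CASE WHEN period = '{key}' THEN amount END) AS {key}" for key in KEYS
)
WIDE_SQL = f"""
WITH cte (product_id, period, amount) AS ({STACK_SQL})
SELECT product_id,
    {PIVOT_COLUMNS}
FROM cte
GROUP BY product_id
ORDER BY product_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (product_id TEXT, period TEXT, sales INTEGER, profit INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", ROWS)
stacked = conn.execute(STACK_SQL).fetchall()  # tall form, 2 rows per input row
wide = conn.execute(WIDE_SQL).fetchall()      # one row per product
conn.close()

print(len(stacked), "stacked rows")
for row in wide:
    print(row)
```

Because each stacked key such as L13_sales encodes both the period and the measure, a single aggregate is enough to spread both measures across columns.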

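If you want to pivot Period twice, once for sales and once for profit, you need to duplicate the column so that each pivot has its own copy. Because the duplicated column is still present after the first pivot, this produces nulls. To deal with them, we can use max in the final select. Here is what the implementation looks like: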

select product_id, 
       max(L13_sales) as L13_sales, 
       max(L26_sales) as L26_sales, 
       max(L52_sales) as L52_sales, 
       max(L13_profit) as L13_profit, 
       max(L26_profit) as L26_profit, 
       max(L52_profit) as L52_profit
from (select *, period as period2 from t) t
      pivot(max(sales) for period in ('L13','L26','L52'))
      pivot(max(profit) for period2 in ('L13','L26','L52'))  
      as p (product_id, L13_sales,L26_sales,L52_sales,L13_profit,L26_profit,L52_profit)
group by product_id;

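At this point the query is getting hard on the eyes. You might as well use conditional aggregation or, better yet, handle the pivot in the reporting application. A more compact alternative to conditional aggregation uses decode: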

select product_id,
       max(decode(period,'L13',sales)) as L13_sales,
       max(decode(period,'L26',sales)) as L26_sales,
       max(decode(period,'L52',sales)) as L52_sales,
       max(decode(period,'L13',profit)) as L13_profit,
       max(decode(period,'L26',profit)) as L26_profit,
       max(decode(period,'L52',profit)) as L52_profit
from t
group by product_id;

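I'm not 100% happy with this answer... pretty sure someone can improve on this approach.

Basically, this pivots an ARRAY. The list of aggregate functions available to an ARRAY is not huge: there is just one, ARRAY_AGG. And PIVOT only supports AVG, COUNT, MAX, MIN, and SUM, so this should not work... yet it does, because I think PIVOT just needs some kind of aggregation.

I would suggest aggregating your metrics before building the ARRAY... but this does let you pivot multiple metrics at once, which from reading Stack Overflow should not be possible!

Copy | Paste | Run ... and please improve :-)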

WITH CTE AS( SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100 SALES,10 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200 SALES,20 PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300 SALES,30 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500 SALES,110 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600 SALES,120 PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700 SALES,130 PROFIT)


SELECT 
PRODUCT_ID
,"'L13'"[0][0] SALES_L13 
,"'L13'"[0][1] PROFIT_L13 
,"'L26'"[0][0] SALES_L26 
,"'L26'"[0][1] PROFIT_L26 
,"'L52'"[0][0] SALES_L52 
,"'L52'"[0][1] PROFIT_L52 
FROM 
(SELECT * FROM 
   (
   SELECT PRODUCT_ID, PERIOD,ARRAY_CONSTRUCT(SALES,PROFIT) S FROM CTE)
   PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')
   ) 
 )  

Aggregation example (adds a 1700, 1130 row to L52 for X2):

WITH CTE AS(
  SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100  SALES,10   PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200  SALES,20   PROFIT
UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300  SALES,30   PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500  SALES,110  PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600  SALES,120  PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700  SALES,130  PROFIT
UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)

SELECT 
    PRODUCT_ID
    ,"'L13'"[0][0] SALES_L13 
    ,"'L13'"[0][1] PROFIT_L13 
    ,"'L26'"[0][0] SALES_L26 
    ,"'L26'"[0][1] PROFIT_L26 
    ,"'L52'"[0][0] SALES_L52 
    ,"'L52'"[0][1] PROFIT_L52 
FROM 
   (SELECT * FROM 
   (
   SELECT PRODUCT_ID, PERIOD,ARRAY_CONSTRUCT(SUM(SALES),SUM(PROFIT)) S FROM CTE GROUP BY 1,2)
   PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')
   ) 
)  
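
To see why aggregating before building the ARRAY matters, here is a small pure-Python sketch that mimics what lands in each pivoted cell. It is only an illustration under stated assumptions: the helper names are hypothetical, and Python keeps insertion order whereas the element order produced by ARRAY_AGG is not guaranteed unless an ordering is specified.

```python
from collections import defaultdict

# The same rows as the aggregation example, including the extra X2/L52 row.
ROWS = [
    ("X1", "L13", 100, 10),
    ("X1", "L26", 200, 20),
    ("X1", "L52", 300, 30),
    ("X2", "L13", 500, 110),
    ("X2", "L26", 600, 120),
    ("X2", "L52", 700, 130),
    ("X2", "L52", 1700, 1130),
]


def pivot_arrays(rows):
    """Mimic PIVOT(ARRAY_AGG(S)): collect every [sales, profit] pair per cell."""
    cells = defaultdict(list)
    for product_id, period, sales, profit in rows:
        cells[(product_id, period)].append([sales, profit])
    return cells


def pre_aggregate(rows):
    """Mimic SUM(SALES), SUM(PROFIT) ... GROUP BY product_id, period."""
    totals = defaultdict(lambda: [0, 0])
    for product_id, period, sales, profit in rows:
        totals[(product_id, period)][0] += sales
        totals[(product_id, period)][1] += profit
    return [(pid, per, s, p) for (pid, per), (s, p) in totals.items()]


raw = pivot_arrays(ROWS)
summed = pivot_arrays(pre_aggregate(ROWS))

# Without pre-aggregation the cell holds two pairs, so [0] drops one of them.
print(raw[("X2", "L52")])
# With pre-aggregation the cell holds exactly one pair, so [0] is the total.
print(summed[("X2", "L52")])
```

In the first query, "'L52'"[0][0] would therefore return only one of the two X2 rows, while the pre-aggregated version returns their sum.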

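Here is an alternative form using OBJECT_AGG and LATERAL FLATTEN, which avoids the potential support issue with the PIVOT and ARRAY_AGG combination proposed by Adrian White.

This should work for any aggregation over multiple input columns that are included in the initial ARRAY_CONSTRUCT in the OBJ_TALL CTE. I expect the conditional aggregation option with CASE statements to be faster, but you would need to test at scale to find out.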

-- OBJECT FORM USING LATERAL FLATTEN 
WITH CTE AS(
                   SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100  SALES,10   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200  SALES,20   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300  SALES,30   PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500  SALES,110  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600  SALES,120  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700  SALES,130  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)
,OBJ_TALL AS (  SELECT PRODUCT_ID, 
                OBJECT_CONSTRUCT(PERIOD,
                                 ARRAY_CONSTRUCT(  SUM(SALES)
                                                  ,SUM(PROFIT)
                                                )
                                 ) S 
                  FROM CTE 
              GROUP BY PRODUCT_ID, PERIOD)
-- SELECT * FROM OBJ_TALL;
,OBJ_WIDE AS (  SELECT PRODUCT_ID, OBJECT_AGG(KEY,VALUE) OA 
                  FROM OBJ_TALL, LATERAL FLATTEN(INPUT => S) 
              GROUP BY PRODUCT_ID)
-- SELECT * FROM OBJ_WIDE;
SELECT 
    PRODUCT_ID
    ,OA:L13[0] SALES_L13 
    ,OA:L13[1] PROFIT_L13 
    ,OA:L26[0] SALES_L26 
    ,OA:L26[1] PROFIT_L26 
    ,OA:L52[0] SALES_L52 
    ,OA:L52[1] PROFIT_L52 
FROM OBJ_WIDE
ORDER BY 1;
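
The shape of the data at each step of the object form can be sketched in plain Python. This is an illustration only, assuming a dict per product stands in for the OBJECT built by OBJECT_AGG; the function name and the handling of missing periods are choices made for this sketch.

```python
from collections import defaultdict

ROWS = [
    ("X1", "L13", 100, 10),
    ("X1", "L26", 200, 20),
    ("X1", "L52", 300, 30),
    ("X2", "L13", 500, 110),
    ("X2", "L26", 600, 120),
    ("X2", "L52", 700, 130),
    ("X2", "L52", 1700, 1130),
]
PERIODS = ["L13", "L26", "L52"]


def object_form_pivot(rows):
    """Return one dict per product with SALES_<period> and PROFIT_<period> keys."""
    # OBJ_TALL step: [SUM(sales), SUM(profit)] per (product, period).
    totals = defaultdict(lambda: [0, 0])
    for product_id, period, sales, profit in rows:
        totals[(product_id, period)][0] += sales
        totals[(product_id, period)][1] += profit

    # OBJ_WIDE step: one object per product, keyed by period.
    wide = defaultdict(dict)
    for (product_id, period), pair in totals.items():
        wide[product_id][period] = pair

    # Final select: pull the two array elements out of each period key.
    result = []
    for product_id in sorted(wide):
        record = {"PRODUCT_ID": product_id}
        for period in PERIODS:
            # A missing period yields None here, standing in for NULL.
            sales, profit = wide[product_id].get(period, [None, None])
            record[f"SALES_{period}"] = sales
            record[f"PROFIT_{period}"] = profit
        result.append(record)
    return result


pivoted = object_form_pivot(ROWS)
for record in pivoted:
    print(record)
```

Each period key holds exactly one [sales, profit] pair, which is why the SQL above indexes with a single [0] or [1] instead of the [0][0] and [0][1] used in the array form.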

For ease of comparison with the above, here is Adrian's ARRAY_AGG and PIVOT version, reformatted using CTEs.

-- ARRAY FORM - RE-WRITTEN WITH CTES FOR CLARITY AND COMPARISON TO OBJECT FORM
WITH CTE AS(
                   SELECT 'X1' PRODUCT_ID,'L13' PERIOD,100  SALES,10   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L26' PERIOD,200  SALES,20   PROFIT
             UNION SELECT 'X1' PRODUCT_ID,'L52' PERIOD,300  SALES,30   PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L13' PERIOD,500  SALES,110  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L26' PERIOD,600  SALES,120  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,700  SALES,130  PROFIT
             UNION SELECT 'X2' PRODUCT_ID,'L52' PERIOD,1700 SALES,1130 PROFIT)
,ARR_TALL AS (SELECT PRODUCT_ID, 
                     PERIOD,
                     ARRAY_CONSTRUCT( SUM(SALES)
                                     ,SUM(PROFIT)
                                    ) S 
                FROM CTE GROUP BY 1,2)
,ARR_WIDE AS (SELECT * 
                FROM ARR_TALL PIVOT (ARRAY_AGG(S) FOR PERIOD IN ('L13','L26','L52')  )  )
SELECT 
    PRODUCT_ID
    ,"'L13'"[0][0] SALES_L13 
    ,"'L13'"[0][1] PROFIT_L13 
    ,"'L26'"[0][0] SALES_L26 
    ,"'L26'"[0][1] PROFIT_L26 
    ,"'L52'"[0][0] SALES_L52 
    ,"'L52'"[0][1] PROFIT_L52 
FROM ARR_WIDE
ORDER BY 1;