Making a conditional aggregate
For business reasons I have run into a tricky grouping problem. I have a table with values like this:
----------------------------
| NAME | TYPE | VALUE |
----------------------------
| N1 | T1 | V1 |
| N1 | T2 | V2 |
| N1 | NULL | V3 |
| N2 | T2 | V4 |
| N2 | NULL | V5 |
| N3 | NULL | V6 |
-----------------------------
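I need to group it as follows:
- The first level of grouping is by name.
- At the second level:
  - When the available types are T1, T2 and NULL, group T1 and NULL together and group T2 separately.
  - When the available types are T2 and NULL, group NULL with T2.
  - When NULL is the only available type, leave it as is.
The expected output for the above table is: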
----------------------------
| N1 | T1 | V1+V3 |
| N1 | T2 | V2 |
| N2 | T2 | V4+V5 |
| N3 | NULL | V6 |
-----------------------------
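How can I achieve this in Snowflake SQL? An answer for any other server is also welcome, so that I can find the equivalent in Snowflake.
The following query should work: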
SELECT t1.NAME, COALESCE(TYPE, MIN_TYPE), SUM(VALUE)
FROM mytable AS t1
JOIN (
    SELECT NAME, MIN(TYPE) AS MIN_TYPE
    FROM mytable
    GROUP BY NAME
) AS t2 ON t1.NAME = t2.NAME
GROUP BY t1.NAME, COALESCE(TYPE, MIN_TYPE)
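The query uses a derived table to get the MIN(TYPE) value for each NAME. Because MIN ignores NULLs, this is T1 when the name has a T1 row, T2 when it only has T2 rows, and NULL when every TYPE is NULL. With COALESCE we can then convert each NULL TYPE to T1 or T2.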
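As a quick sanity check, the query above can be run against the sample rows in an in-memory SQLite database. This is only a sketch: SQLite stands in for Snowflake, and the numeric values 1, 2, 4, 8, 16, 32 are placeholders I chose for V1..V6 so that each sum is easy to recognise.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (NAME TEXT, TYPE TEXT, VALUE INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("N1", "T1", 1),   # V1
        ("N1", "T2", 2),   # V2
        ("N1", None, 4),   # V3
        ("N2", "T2", 8),   # V4
        ("N2", None, 16),  # V5
        ("N3", None, 32),  # V6
    ],
)

# Same query as in the answer, with column references qualified for clarity.
query = """
SELECT t1.NAME, COALESCE(t1.TYPE, t2.MIN_TYPE) AS TYPE, SUM(t1.VALUE) AS VALUE
FROM mytable AS t1
JOIN (
    SELECT NAME, MIN(TYPE) AS MIN_TYPE
    FROM mytable
    GROUP BY NAME
) AS t2 ON t1.NAME = t2.NAME
GROUP BY t1.NAME, COALESCE(t1.TYPE, t2.MIN_TYPE)
"""

# Sort in Python so the comparison does not depend on NULL ordering rules.
rows = sorted(conn.execute(query).fetchall(), key=lambda r: (r[0], r[1] or ""))
for row in rows:
    print(row)
```

Note that the approach relies on 'T1' sorting before 'T2'; with other type names the MIN would have to be replaced by an explicit priority.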
Edit:
You can create a pivoted version of the expected result set using the following query:
SELECT NAME,
       CASE
           WHEN T1SUM IS NULL THEN 0
           ELSE COALESCE(T1SUM, 0) + COALESCE(NULLSUM, 0)
       END AS T1SUM,
       CASE
           WHEN T1SUM IS NULL AND T2SUM IS NOT NULL
               THEN COALESCE(T2SUM, 0) + COALESCE(NULLSUM, 0)
           ELSE COALESCE(T2SUM, 0)
       END AS T2SUM,
       CASE
           WHEN T1SUM IS NULL AND T2SUM IS NULL THEN COALESCE(NULLSUM, 0)
           ELSE 0
       END AS NULLSUM
FROM (
    SELECT NAME,
           SUM(CASE WHEN TYPE = 'T1' THEN VALUE END) AS T1SUM,
           SUM(CASE WHEN TYPE = 'T2' THEN VALUE END) AS T2SUM,
           SUM(CASE WHEN TYPE IS NULL THEN VALUE END) AS NULLSUM
    FROM mytable
    GROUP BY NAME) AS t
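To see what the pivoted form returns, here is a minimal sketch that runs it in SQLite on the sample rows. As assumptions of mine, the values 1, 2, 4, 8, 16, 32 stand in for V1..V6, the inner columns are qualified with `t.`, and the output columns are renamed to `T1_TOTAL`, `T2_TOTAL` and `NULL_TOTAL` so they cannot be confused with the inner sums.

```python
import sqlite3

# In-memory SQLite database standing in for Snowflake (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (NAME TEXT, TYPE TEXT, VALUE INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("N1", "T1", 1), ("N1", "T2", 2), ("N1", None, 4),
        ("N2", "T2", 8), ("N2", None, 16),
        ("N3", None, 32),
    ],
)

pivot_query = """
SELECT t.NAME,
       -- NULL rows join T1 whenever the name has a T1 row
       CASE WHEN t.T1SUM IS NULL THEN 0
            ELSE t.T1SUM + COALESCE(t.NULLSUM, 0)
       END AS T1_TOTAL,
       -- otherwise they join T2, if the name has a T2 row
       CASE WHEN t.T1SUM IS NULL AND t.T2SUM IS NOT NULL
                THEN t.T2SUM + COALESCE(t.NULLSUM, 0)
            ELSE COALESCE(t.T2SUM, 0)
       END AS T2_TOTAL,
       -- and stay on their own only when neither type exists
       CASE WHEN t.T1SUM IS NULL AND t.T2SUM IS NULL THEN COALESCE(t.NULLSUM, 0)
            ELSE 0
       END AS NULL_TOTAL
FROM (
    SELECT NAME,
           SUM(CASE WHEN TYPE = 'T1' THEN VALUE END) AS T1SUM,
           SUM(CASE WHEN TYPE = 'T2' THEN VALUE END) AS T2SUM,
           SUM(CASE WHEN TYPE IS NULL THEN VALUE END) AS NULLSUM
    FROM mytable
    GROUP BY NAME
) AS t
ORDER BY t.NAME
"""

pivot_rows = conn.execute(pivot_query).fetchall()
for row in pivot_rows:
    print(row)
```

Each name comes back as a single row, with zeros in the columns that do not apply to it.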
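So in Giorgos's answer the totals are given in pivoted form, that is, as a single row per name. With so few cases per name, this can be written more simply.
Using this data: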
WITH data_table(name, type, value) AS (
SELECT * FROM VALUES
(10, 1, 100 ),
(10, 2, 200 ),
(10, null, 400 ),
(11, 2, 100 ),
(11, null, 200 ),
(12, null, 100 )
)
and this SQL:
SELECT name
,SUM(IFF(type=1, value, null)) as t1_val
,SUM(IFF(type=2, value, null)) as t2_val
,SUM(IFF(type is null, value, null)) as tnull_val
,IFF(t1_val is not null, t1_val + zeroifnull(tnull_val), null) as c1_sum
,IFF(t1_val is not null, t2_val, t2_val + zeroifnull(tnull_val)) as c2_sum
,IFF(t1_val is null AND t2_val is null, tnull_val, null) as c3_sum
FROM data_table
GROUP BY 1;
we get:
NAME | T1_VAL | T2_VAL | TNULL_VAL | C1_SUM | C2_SUM | C3_SUM |
---|---|---|---|---|---|---|
10 | 100 | 200 | 400 | 500 | 200 | null |
11 | null | 100 | 200 | null | 300 | null |
12 | null | null | 100 | null | null | 100 |
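This shows that for name 10 the null sum is bound to the type 1 sum, for name 11 the null sum is bound to the type 2 sum, and for name 12 we get the null sum on its own.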
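The three `c*_sum` expressions can be read as a small decision rule. The sketch below restates them in plain Python, with `None` standing in for SQL NULL; the function name `combine` is hypothetical and only serves the illustration.

```python
def combine(t1_val, t2_val, tnull_val):
    """Mirror the c1_sum / c2_sum / c3_sum expressions; None plays the role of SQL NULL."""
    null_or_zero = tnull_val if tnull_val is not None else 0  # zeroifnull(tnull_val)

    # c1_sum: the NULL rows join type 1 whenever type 1 exists.
    c1_sum = t1_val + null_or_zero if t1_val is not None else None

    # c2_sum: type 2 keeps its own sum when type 1 exists, otherwise it absorbs
    # the NULL rows (NULL + anything stays NULL, as in SQL).
    if t1_val is not None:
        c2_sum = t2_val
    else:
        c2_sum = t2_val + null_or_zero if t2_val is not None else None

    # c3_sum: the NULL rows stand alone only when neither type exists.
    c3_sum = tnull_val if t1_val is None and t2_val is None else None
    return c1_sum, c2_sum, c3_sum


# The three rows of the result table above.
print(combine(100, 200, 400))
print(combine(None, 100, 200))
print(combine(None, None, 100))
```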
If we want, we can unpivot these values by joining to a mini table of 3 rows, like so:
SELECT d.name,
       p.c2 as type,
       case p.c1
           WHEN 1 then d.c1_sum
           WHEN 2 then d.c2_sum
           ELSE d.c3_sum
       end as value
FROM (
    SELECT name
        ,SUM(IFF(type=1, value, null)) as t1_val
        ,SUM(IFF(type=2, value, null)) as t2_val
        ,SUM(IFF(type is null, value, null)) as tnull_val
        ,IFF(t1_val is not null, t1_val + zeroifnull(tnull_val), null) as c1_sum
        ,IFF(t1_val is not null, t2_val, t2_val + zeroifnull(tnull_val)) as c2_sum
        ,IFF(t1_val is null AND t2_val is null, tnull_val, null) as c3_sum
    FROM data_table
    GROUP BY 1
) AS d
JOIN (
    SELECT column1 as c1, column2 as c2
    FROM VALUES (1,'T1'),(2,'T2'),(null,'null')
) AS p
    ON ((d.c1_sum is not null AND p.c1 = 1)
     OR (d.c2_sum is not null AND p.c1 = 2)
     OR (d.c3_sum is not null AND p.c1 is null))
ORDER BY 1,2;
Which gives the originally requested output:
NAME | TYPE | VALUE |
---|---|---|
10 | T1 | 500 |
10 | T2 | 200 |
11 | T2 | 300 |
12 | null | 100 |