Organizing data into multiple columns in SQL
I have a table with historical activity data for more than 5 million customers, as in the example below:
Customer ID | PART_ID | Activity |
---|---|---|
12345 | 202012 | 2 |
12345 | 202101 | 0 |
12345 | 202102 | 5 |
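I want to pivot this data into multiple columns: customers as rows, dates as columns, with each column holding the activity information for that period.
I wrote the code below, but instead of producing a single row per customer, the customer is repeated and I get a table like this: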
Customer ID | 202012 | 202101 | 202102 |
---|---|---|---|
12345 | 1 | 0 | 0 |
12345 | 0 | 0 | 0 |
12345 | 0 | 0 | 1 |
Instead of:
Customer ID | 202012 | 202101 | 202102 |
---|---|---|---|
12345 | 1 | 0 | 1 |
What am I doing wrong?
SELECT *
FROM
(
SELECT CUST_ID, RULLED_PROFIT_CENTER,
CASE WHEN PART_ID = 202012 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS ARA_20,
CASE WHEN PART_ID = 202101 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS OCA_21,
CASE WHEN PART_ID = 202102 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS SUB_21,
CASE WHEN PART_ID = 202103 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS MAR_21,
CASE WHEN PART_ID = 202104 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS NIS_21,
CASE WHEN PART_ID = 202105 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS MAY_21,
CASE WHEN PART_ID = 202106 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS HAZ_21,
CASE WHEN PART_ID = 202107 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS TEM_21,
CASE WHEN PART_ID = 202108 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS AGU_21,
CASE WHEN PART_ID = 202109 AND ACTIVITY > 0 THEN 1 ELSE 0 END AS EYL_21
FROM ACTIVITY
WHERE RULLED_PROFIT_CENTER IN (108, 103, 170)
GROUP BY CUST_ID, RULLED_PROFIT_CENTER
)
WHERE ARA_20 + OCA_21 + SUB_21 + MAR_21 + NIS_21 + MAY_21 +
HAZ_21 + TEM_21 + AGU_21 + EYL_21 > 0
WHERE RULLED_PROFIT_CENTER IN (108, 103, 170)
GROUP BY CUST_ID, RULLED_PROFIT_CENTER
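You are grouping by two values: the customer and RULLED_PROFIT_CENTER. RULLED_PROFIT_CENTER can take three possible values, so you can get up to three rows per customer. If you want only one row per customer, remove RULLED_PROFIT_CENTER from your GROUP BY.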
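To see the effect of the grouping key in isolation, here is a minimal sketch. The question does not name its database, so this uses SQLite through Python's sqlite3 module as a stand-in. The profit center assigned to each sample row is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ACTIVITY ("
    "CUST_ID INTEGER, RULLED_PROFIT_CENTER INTEGER, "
    "PART_ID INTEGER, ACTIVITY INTEGER)"
)
# Sample rows from the question; the profit centers are assumed values.
conn.executemany(
    "INSERT INTO ACTIVITY VALUES (?, ?, ?, ?)",
    [
        (12345, 108, 202012, 2),
        (12345, 103, 202101, 0),
        (12345, 170, 202102, 5),
    ],
)

# Grouping by both columns: one output row per (customer, profit center) pair.
per_pair = conn.execute(
    "SELECT CUST_ID, RULLED_PROFIT_CENTER, COUNT(*) "
    "FROM ACTIVITY GROUP BY CUST_ID, RULLED_PROFIT_CENTER"
).fetchall()

# Grouping by the customer only: one output row per customer.
per_customer = conn.execute(
    "SELECT CUST_ID, COUNT(*) FROM ACTIVITY GROUP BY CUST_ID"
).fetchall()

print(len(per_pair))      # 3
print(len(per_customer))  # 1
```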
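You probably added RULLED_PROFIT_CENTER to the GROUP BY because the query would not run without it. If you want RULLED_PROFIT_CENTER in your SELECT but not in your GROUP BY, you also need to aggregate it, even if each customer has only one value. Use string_agg.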
SELECT CUST_ID, string_agg(RULLED_PROFIT_CENTER, ', ')
If a customer has only one value for RULLED_PROFIT_CENTER, you will get just that value.
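The sketch below illustrates this behaviour. It is not the asker's environment: it runs on SQLite through Python's sqlite3 module, where group_concat plays the role of string_agg. The second customer (67890) and all profit center assignments are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ACTIVITY ("
    "CUST_ID INTEGER, RULLED_PROFIT_CENTER INTEGER, "
    "PART_ID INTEGER, ACTIVITY INTEGER)"
)
conn.executemany(
    "INSERT INTO ACTIVITY VALUES (?, ?, ?, ?)",
    [
        (12345, 108, 202012, 2),  # assumed profit centers for customer 12345
        (12345, 103, 202101, 0),
        (12345, 170, 202102, 5),
        (67890, 108, 202012, 1),  # hypothetical customer with a single profit center
    ],
)

# group_concat is SQLite's counterpart of string_agg: it joins the values
# of each group into one string, so the column no longer needs to be in GROUP BY.
result = conn.execute(
    """
    SELECT CUST_ID, group_concat(RULLED_PROFIT_CENTER, ', ') AS PROFIT_CENTERS
    FROM ACTIVITY
    GROUP BY CUST_ID
    ORDER BY CUST_ID
    """
).fetchall()

for cust_id, centers in result:
    print(cust_id, centers)
```

Customer 67890 comes back with the single value 108, while customer 12345 gets all three profit centers in one string. The order of values inside that string is not guaranteed unless the database is told how to sort them.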
Because you are grouping all of a customer's rows together, you need to take the max of each activity flag.
MAX( CASE WHEN PART_ID = 202012 AND ACTIVITY > 0 THEN 1 ELSE 0 END ) AS ARA_20
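Putting both changes together (grouping by CUST_ID only and wrapping each CASE in MAX), the query might look like the following sketch. It is trimmed to the three months in the sample data and run on SQLite through Python's sqlite3 module as a stand-in for the unnamed database. The profit center values, the second customer (67890), and the subquery alias t are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ACTIVITY ("
    "CUST_ID INTEGER, RULLED_PROFIT_CENTER INTEGER, "
    "PART_ID INTEGER, ACTIVITY INTEGER)"
)
conn.executemany(
    "INSERT INTO ACTIVITY VALUES (?, ?, ?, ?)",
    [
        (12345, 108, 202012, 2),  # sample rows; profit centers are assumed
        (12345, 103, 202101, 0),
        (12345, 170, 202102, 5),
        (67890, 108, 202012, 0),  # hypothetical customer with no activity
    ],
)

# MAX collapses the per-row 0/1 flags into one flag per customer and month:
# the result is 1 if any row of that customer was active in that month.
pivot = conn.execute(
    """
    SELECT *
    FROM (
        SELECT CUST_ID,
               MAX(CASE WHEN PART_ID = 202012 AND ACTIVITY > 0 THEN 1 ELSE 0 END) AS ARA_20,
               MAX(CASE WHEN PART_ID = 202101 AND ACTIVITY > 0 THEN 1 ELSE 0 END) AS OCA_21,
               MAX(CASE WHEN PART_ID = 202102 AND ACTIVITY > 0 THEN 1 ELSE 0 END) AS SUB_21
        FROM ACTIVITY
        WHERE RULLED_PROFIT_CENTER IN (108, 103, 170)
        GROUP BY CUST_ID
    ) t
    WHERE ARA_20 + OCA_21 + SUB_21 > 0
    """
).fetchall()

print(pivot)  # [(12345, 1, 0, 1)]
```

Customer 12345 now appears on a single row with the flags 1, 0, 1, and the inactive customer is dropped by the outer WHERE. The remaining months follow the same pattern, one MAX(CASE ...) column each.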