SQL/BigQuery: Pivot combinations of rows into columns, keeping all pairs

Suppose I have a table like this:

| id | brand | fuel | mpg |
|:--:|:------:|:------:|:---:|
| 1 | ford | diesel | 14 |
| 1 | ford | gas | 20 |
| 1 | toyota | diesel | 30 |
| 1 | toyota | gas | 35 |

and I want to pivot the columns so that the result looks like this:

| id | ford | toyota | toyota_mpg | ford_mpg |
|:--:|:------:|--------|:----------:|:--------:|
| 1 | diesel | diesel | 30 | 14 |
| 1 | gas | gas | 35 | 20 |
| 1 | diesel | gas | 35 | 14 |
| 1 | gas | diesel | 30 | 20 |

So far I have:

    SELECT id,
      MAX(CASE WHEN brand = 'ford' THEN fuel ELSE NULL END) ford,
      SUM(CASE WHEN brand = 'ford' THEN mpg ELSE NULL END) ford_mpg,
      MAX(CASE WHEN brand = 'toyota' THEN fuel ELSE NULL END) toyota,
      SUM(CASE WHEN brand = 'toyota' THEN mpg ELSE NULL END) toyota_mpg
    FROM table
    GROUP BY id, fuel

This gives the result below, which is correct for the rows where the fuels line up:

| id | ford | toyota | toyota_mpg | ford_mpg |
|:--:|:------:|--------|:----------:|:--------:|
| 1 | diesel | diesel | 30 | 14 |
| 1 | gas | gas | 35 | 20 |

But I can't get the combinations where the fuels don't match.

Try the following:

    select id,
      t1.fuel ford,
      t2.fuel toyota,
      t1.mpg ford_mpg,
      t2.mpg toyota_mpg
    from data t1
    join data t2
    using (id)
    where t1.brand < t2.brand

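If applied to the sample data in your question, the output is the four ford/toyota fuel pairs (row order is not guaranteed without an ORDER BY):

| id | ford | toyota | ford_mpg | toyota_mpg |
|:--:|:------:|:------:|:--------:|:----------:|
| 1 | diesel | diesel | 14 | 30 |
| 1 | diesel | gas | 14 | 35 |
| 1 | gas | diesel | 20 | 30 |
| 1 | gas | gas | 20 | 35 |
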
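
The self-join can be tried without creating a table by inlining the question's sample rows in a CTE. The sketch below is a variant, not the exact query above: it assumes BigQuery Standard SQL, the aliases `f` and `t` are illustrative, and it filters each side by brand name explicitly instead of relying on `t1.brand < t2.brand`. That comparison works here only because 'ford' sorts before 'toyota'; with a third brand it would also pair up the other brand combinations under the same column names.

```sql
-- Sample rows from the question, inlined so the query runs without a real table.
WITH data AS (
  SELECT 1 AS id, 'ford' AS brand, 'diesel' AS fuel, 14 AS mpg UNION ALL
  SELECT 1, 'ford', 'gas', 20 UNION ALL
  SELECT 1, 'toyota', 'diesel', 30 UNION ALL
  SELECT 1, 'toyota', 'gas', 35
)
SELECT
  id,
  f.fuel AS ford,
  t.fuel AS toyota,
  t.mpg AS toyota_mpg,
  f.mpg AS ford_mpg
FROM data AS f
JOIN data AS t USING (id)   -- pair every row with every other row of the same id
WHERE f.brand = 'ford'      -- left side: ford rows only
  AND t.brand = 'toyota'    -- right side: toyota rows only
ORDER BY ford, toyota;
```

With the four sample rows this should return one row per ford/toyota fuel pair, four in total, in the column order the question asked for.
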
You can explore the PIVOT operator in BigQuery, though I'm not sure what your use case is for taking sum/avg/max/min at the id level!

    WITH
      -- Sample data from the question
      result AS (
        SELECT 1 AS id, 'ford' AS brand, 'diesel' AS fuel, 14 AS mpg
        UNION ALL
        SELECT 1 AS id, 'ford' AS brand, 'gas' AS fuel, 20 AS mpg
        UNION ALL
        SELECT 1 AS id, 'toyota' AS brand, 'diesel' AS fuel, 30 AS mpg
        UNION ALL
        SELECT 1 AS id, 'toyota' AS brand, 'gas' AS fuel, 35 AS mpg
      ),
      -- Pivot twice: first mpg by brand, then fuel by a copy of brand
      pivot_result AS (
        SELECT id, ford, toyota, mpg_ford, mpg_toyota
        FROM (
          SELECT *
          FROM (
            SELECT id, fuel, brand AS brand_, brand, mpg
            FROM result
          ) PIVOT (AVG(mpg) AS mpg FOR brand IN ('ford', 'toyota'))
        ) PIVOT (MAX(fuel) FOR brand_ IN ('ford', 'toyota'))
      )
    -- Join the ford rows to the toyota rows to get every fuel pair
    SELECT
      f.id,
      f.ford,
      t.toyota,
      t.mpg_toyota,
      f.mpg_ford
    FROM pivot_result AS f
    INNER JOIN (
      SELECT id, toyota, mpg_toyota
      FROM pivot_result
    ) AS t
      ON t.id = f.id
    WHERE f.ford IS NOT NULL
      AND f.mpg_ford IS NOT NULL
      AND t.toyota IS NOT NULL
      AND t.mpg_toyota IS NOT NULL
    GROUP BY 1, 2, 3, 4, 5