SQL query to combine multiple rows into one based on ID while keeping the other values in the same row?
First of all, I have been searching for a while.
I have a table that looks something like this:
ID  Expenditure  MonthYear
1A  1,000        122019
1A  1,500        012020
1B  1,900        122019
1C  2,400        122019
1B  2,400        012020
1C  900          012020
1A  800          022020
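Since the table can reach several thousand rows and some IDs repeat dozens of times, I would like to combine the rows for each ID into a single row, adding columns so that all of the information is kept. I want the table to look like this: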
ID  Expenditure_1  MonthYear_1  Expenditure_2  MonthYear_2  Expenditure_3  MonthYear_3
1A  1,000          122019       1,500          012020       800            022020
1B  1,900          122019       2,400          012020       Null           Null
1C  2,400          122019       900            012020       Null           Null
What is the best way to do this with SQL on Impala?
Thanks.
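You can use conditional aggregation with row_number(): number each ID's rows in chronological order, then pick out each numbered row with one max(case ...) per output column: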
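-- One output row per id; each max(case ...) keeps the value from the row
-- whose sequence number matches and ignores the NULLs from all other rows.
select id,
       max(case when seqnum = 1 then expenditure end) as expenditure_1,
       max(case when seqnum = 1 then monthyear end) as monthyear_1,
       max(case when seqnum = 2 then expenditure end) as expenditure_2,
       max(case when seqnum = 2 then monthyear end) as monthyear_2,
       max(case when seqnum = 3 then expenditure end) as expenditure_3,
       max(case when seqnum = 3 then monthyear end) as monthyear_3
from (select t.*,
             -- monthyear is MMYYYY (assumed to be a string), so sort by
             -- year (last 4 characters) and then month (first 2 characters)
             row_number() over (partition by id order by right(monthyear, 4), left(monthyear, 2)) as seqnum
      from t
     ) t
group by id;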
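If you want to try the query on the sample data without an Impala cluster, here is a minimal sketch that runs the same pattern in an in-memory SQLite database from Python. Several things here are my assumptions rather than part of the question: SQLite stands in for Impala (it needs SQLite 3.25 or later for window functions), substr() replaces right()/left() because SQLite has no such functions, expenditure is stored as an integer without the thousands separators, monthyear is stored as an MMYYYY string, and the helper name pivot is made up for illustration. The final order by id is only there to make the output order predictable.

```python
import sqlite3

# Sample rows from the question. Expenditure is stored as an integer and
# monthyear as an MMYYYY string (both are assumptions about the real table).
ROWS = [
    ("1A", 1000, "122019"),
    ("1A", 1500, "012020"),
    ("1B", 1900, "122019"),
    ("1C", 2400, "122019"),
    ("1B", 2400, "012020"),
    ("1C", 900, "012020"),
    ("1A", 800, "022020"),
]

# Same conditional aggregation as the Impala query, with substr() standing in
# for right()/left(): substr(monthyear, 3, 4) is the year, substr(monthyear, 1, 2)
# is the month.
PIVOT_SQL = """
select id,
       max(case when seqnum = 1 then expenditure end) as expenditure_1,
       max(case when seqnum = 1 then monthyear end) as monthyear_1,
       max(case when seqnum = 2 then expenditure end) as expenditure_2,
       max(case when seqnum = 2 then monthyear end) as monthyear_2,
       max(case when seqnum = 3 then expenditure end) as expenditure_3,
       max(case when seqnum = 3 then monthyear end) as monthyear_3
from (select t.*,
             row_number() over (
                 partition by id
                 order by substr(monthyear, 3, 4), substr(monthyear, 1, 2)
             ) as seqnum
      from t
     ) t
group by id
order by id
"""


def pivot(rows):
    """Load rows into an in-memory table t and return one pivoted row per id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t (id text, expenditure integer, monthyear text)"
        )
        conn.executemany("insert into t values (?, ?, ?)", rows)
        return conn.execute(PIVOT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in pivot(ROWS):
        print(row)
```

IDs with fewer than three rows come back with None (NULL) in the unused columns, as in the desired output. The number of column pairs is fixed in the query text, so an ID with more than three rows would need additional seqnum = 4, 5, ... pairs to be shown in full.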