How to get values into a dense table with category columns using PostgreSQL (crosstab)?
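I have this toy example, which gives me a sparse table of values separated into their different categories. I would like a dense matrix in which every column is ordered individually.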
drop table if exists temp_table;
create temp table temp_table(
rowid int
, category text
, score int
);
insert into temp_table values (0, 'cat1', 10);
insert into temp_table values (1, 'cat2', 21);
insert into temp_table values (2, 'cat3', 32);
insert into temp_table values (3, 'cat2', 23);
insert into temp_table values (4, 'cat2', 24);
insert into temp_table values (5, 'cat3', 35);
insert into temp_table values (6, 'cat1', 16);
insert into temp_table values (7, 'cat1', 17);
insert into temp_table values (8, 'cat2', 28);
insert into temp_table values (9, 'cat2', 29);
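This gives the following temp table:

| rowid | category | score |
|---|---|---|
| 0 | cat1 | 10 |
| 1 | cat2 | 21 |
| 2 | cat3 | 32 |
| 3 | cat2 | 23 |
| 4 | cat2 | 24 |
| 5 | cat3 | 35 |
| 6 | cat1 | 16 |
| 7 | cat1 | 17 |
| 8 | cat2 | 28 |
| 9 | cat2 | 29 |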
Then I spread the scores into separate columns by category:
select "cat1", "cat2", "cat3"
from crosstab(
$$ select rowid, category, score from temp_table $$ -- as source_sql
, $$ select distinct category from temp_table order by category $$ -- as category_sql
) as (rowid int, "cat1" int, "cat2" int, "cat3" int)
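Output:

| cat1 | cat2 | cat3 |
|---|---|---|
| 10 | | |
| | 21 | |
| | | 32 |
| | 23 | |
| | 24 | |
| | | 35 |
| 16 | | |
| 17 | | |
| | 28 | |
| | 29 | |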
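But I would like the query result to be dense, like this:

| cat1 | cat2 | cat3 |
|---|---|---|
| 10 | 21 | 32 |
| 16 | 23 | 35 |
| 17 | 24 | |
| | 28 | |
| | 29 | |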
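Maybe PostgreSQL's crosstab is not even the right tool for this, but it was the first thing that came to mind, because the sparse table it produces is close to the result I need.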
This should work for the exact sample data and expected output given.
select max(cat1), max(cat2), max(cat3)
from crosstab(
$$ select rank() over(partition by category order by rowid) as ranking,
rowid,
category,
score
from temp_table
order by rowid, category asc$$ -- as source_sql
, $$ select distinct category
from temp_table
order by category $$ -- as category_sql
) as (ranking int, rowid int, "cat1" int, "cat2" int, "cat3" int)
group by ranking
order by ranking asc
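You can test the solution here: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f198e40a18a282cc0d65fa6ecdf797cb
Edit: here is how I changed your query to arrive at the solution:
- In the source SQL query, I rank the rows within each category by rowid. This fixes the order in which the values of each category should appear, as your expected output requires.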
select rank() over(partition by category order by rowid) as ranking, rowid, category, score from temp_table order by rowid, category asc
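- In the outer query, I group by the ranking obtained from the source SQL query and take the max() value of each category. Each category has at most one non-null value per ranking, so max() simply picks that value.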
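To see why grouping by the ranking produces dense rows, it can help to look at the ranking step on its own. The following is a minimal sketch, assuming the temp_table from the question. It uses row_number() rather than rank(), which numbers the rows identically here because rowid is unique, and it needs no extension.

```sql
-- Number the rows within each category in rowid order.
-- Rows that share a ranking value end up in the same row of the dense result.
select row_number() over (partition by category order by rowid) as ranking
     , category
     , score
from temp_table
order by ranking, category;
```

With the sample data, ranking 1 holds the scores 10, 21 and 32, ranking 2 holds 16, 23 and 35, ranking 3 holds 17 and 24, and rankings 4 and 5 hold only the cat2 scores 28 and 29. These are exactly the rows of the desired dense table.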
with cte as (
select category, score, row_number() over (
partition by category order by score
) as r
from temp_table
)
select
sum(score) filter (where category = 'cat1') as cat1,
sum(score) filter (where category = 'cat2') as cat2,
sum(score) filter (where category = 'cat3') as cat3
from cte
group by r
order by r
;
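If the number of columns is known and reasonably small, FILTER is probably a better choice than CROSSTAB, which requires an extension (tablefunc).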
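If the set of categories is not known in advance, one possible variation (not part of the answers above, and only a sketch) is to keep the same row numbering but collect each dense row into a single jsonb value instead of fixed columns. It assumes the temp_table from the question and the built-in jsonb_object_agg aggregate (PostgreSQL 9.5 or later).

```sql
with cte as (
    select category
         , score
         , row_number() over (partition by category order by score) as r
    from temp_table
)
-- One output row per position r; the keys are whichever categories
-- still have a value at that position.
select r
     , jsonb_object_agg(category, score) as scores
from cte
group by r
order by r;
```

The trade-off is that the result is one jsonb column rather than a real column per category, so the caller has to unpack it. In exchange, the query does not need to be rewritten when a new category appears.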