Pivot table without crosstab/tablefunc
I have a table like this:
Input
id author size file_ext
--------------------------------
1 a 13661 python
1 a 13513 cpp
1 a 1211 non-code
2 b 1019 python
2 b 6881 cpp
2 b 1525 python
2 b 1208 non-code
3 c 1039 python
3 c 1299 cpp
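To try the queries on this page, the sample data can be loaded roughly like this. This is a minimal sketch: the table name `tablename` is the one used in the answers, while the column types are assumptions, since the question does not state them.

```sql
-- Assumed schema: the question does not give column types.
CREATE TABLE tablename (
   id       int
 , author   text
 , size     int
 , file_ext text
);

-- Sample rows copied from the input table in the question.
INSERT INTO tablename (id, author, size, file_ext) VALUES
   (1, 'a', 13661, 'python')
 , (1, 'a', 13513, 'cpp')
 , (1, 'a',  1211, 'non-code')
 , (2, 'b',  1019, 'python')
 , (2, 'b',  6881, 'cpp')
 , (2, 'b',  1525, 'python')
 , (2, 'b',  1208, 'non-code')
 , (3, 'c',  1039, 'python')
 , (3, 'c',  1299, 'cpp');
```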
I would like to pivot this table in the following way:
Output
id author size python cpp non-code
-------------------------------------------------
1 a 13661 1 0 0
1 a 13513 0 1 0
1 a 1211 0 0 1
2 b 1019 1 0 0
2 b 6881 0 1 0
2 b 1525 1 0 0
2 b 1208 0 0 1
3 c 1039 1 0 0
3 c 1299 0 1 0
All the articles I could find online pivot tables based on a second column. My ultimate goal is to get one record per ID.
Final output
id author size python cpp non-code
-------------------------------------------------
1 a 28385 1 1 1
2 b 10633 2 1 1
3 c 2338 1 1 0
Here the values of the size, python, cpp, and non-code columns are aggregated.
With conditional aggregation:
select
id, author,
sum(size) size,
sum((file_ext = 'python')::int) python,
sum((file_ext = 'cpp')::int) cpp,
sum((file_ext = 'non-code')::int) "non-code"
from tablename
group by id, author
See the demo.
Results:
> id | author | size | python | cpp | non-code
> -: | :----- | ----: | -----: | --: | -------:
> 1 | a | 28385 | 1 | 1 | 1
> 2 | b | 10633 | 2 | 1 | 1
> 3 | c | 2338 | 1 | 1 | 0
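The intermediate output from the question (one row per file, with a 0/1 flag per extension) can be sketched with the same boolean-to-integer cast, just without the aggregation. This is an illustration that assumes `file_ext` is never NULL; a NULL value would produce NULL flags rather than 0.

```sql
-- One-hot encode file_ext per row; no GROUP BY, so every input row is kept.
SELECT id, author, size
     , (file_ext = 'python')::int   AS python
     , (file_ext = 'cpp')::int      AS cpp
     , (file_ext = 'non-code')::int AS "non-code"
FROM   tablename;
```

Summing these flag columns per `id, author` gives the final output, which is what the aggregated query above does in a single step.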
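You could also use the PostgreSQL crosstab function, which is provided by the extension named tablefunc. I suggest you take a look at this link (the example given is very similar to the result you want): https://vertabelo.com/blog/creating-pivot-tables-in-postgresql-using-the-crosstab-function/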
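Note that `crosstab()` is not available by default: the tablefunc module has to be installed once per database. A minimal sketch, assuming your role has the privilege to create extensions:

```sql
-- Install the additional module that provides crosstab().
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Optional check: lists the extension if it is installed in this database.
SELECT extname, extversion
FROM   pg_extension
WHERE  extname = 'tablefunc';
```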
While you want to avoid the crosstab() function, use the aggregate FILTER clause for best performance and clearest code:
SELECT id, author
, sum(size) AS size
, count(*) FILTER (WHERE file_ext = 'python') AS python
, count(*) FILTER (WHERE file_ext = 'cpp') AS cpp
, count(*) FILTER (WHERE file_ext = 'non-code') AS "non-code"
FROM tablename
GROUP BY id, author;
This is the fastest way using only aggregate functions. See:
- For absolute performance, is SUM faster or COUNT?
- How can I simplify this game statistics query?
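The FILTER clause requires PostgreSQL 9.4 or later. Where it is not available, the same counts can be sketched with a CASE expression. This is an alternative formulation added for illustration, not part of the answer above:

```sql
-- count() ignores NULL, and CASE without ELSE yields NULL for non-matching rows,
-- so each count() only sees the rows with the given extension.
SELECT id, author
     , sum(size) AS size
     , count(CASE WHEN file_ext = 'python'   THEN 1 END) AS python
     , count(CASE WHEN file_ext = 'cpp'      THEN 1 END) AS cpp
     , count(CASE WHEN file_ext = 'non-code' THEN 1 END) AS "non-code"
FROM   tablename
GROUP  BY id, author;
```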
For absolute best performance, crosstab() is typically faster, although more verbose in this case:
SELECT id, author, size
, COALESCE(python , 0) AS python
, COALESCE(cpp , 0) AS cpp
, COALESCE("non-code", 0) AS "non-code"
FROM crosstab(
$$
SELECT id, author
, sum(sum(size)) OVER (PARTITION BY id) AS size
, file_ext
, count(*) AS ct
FROM tablename
GROUP BY id, author, file_ext
ORDER BY id, author, file_ext
$$
, $$VALUES ('python'), ('cpp'), ('non-code')$$
) AS (id int, author text, size numeric
, python int, cpp int, "non-code" int);
Same result.
db<>fiddle here - with intermediate steps.
Detailed explanation:
- PostgreSQL Crosstab Query
- Pivot on Multiple Columns using Tablefunc
About the window function over an aggregate function (sum(sum(size)) OVER (...)), see:
- Get the distinct sum of a joined table column
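To see what the window function contributes, the inner query passed to `crosstab()` can be run on its own. The inner `sum(size)` is computed per `(id, author, file_ext)` group first; the outer `sum(...) OVER (PARTITION BY id)` then adds up those group sums across all groups with the same `id`, so every row carries the total size for its `id`. The sketch below assumes the sample data from the question.

```sql
-- Aggregation runs before window functions, so the window sum sees one row per group.
SELECT id, author
     , sum(size)                            AS size_per_ext  -- plain aggregate, per group
     , sum(sum(size)) OVER (PARTITION BY id) AS size          -- total per id, repeated on each row
     , file_ext
     , count(*)                             AS ct
FROM   tablename
GROUP  BY id, author, file_ext
ORDER  BY id, author, file_ext;
```

The extra column `size_per_ext` is only there for comparison and is not part of the query fed to `crosstab()`.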
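Note a subtle difference if the same id can have multiple authors: in that case the first query returns multiple rows for that id, while the crosstab() variant just picks the first author.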
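If one row per `id` is wanted even when several authors share an `id`, one option is to group by `id` alone and collapse the authors into a single value. This is a sketch of one possible policy (a comma-separated list of distinct authors), not something the question specifies; the column name `authors` is made up for the example.

```sql
-- Group by id only; authors are merged instead of producing extra rows.
SELECT id
     , string_agg(DISTINCT author, ', ' ORDER BY author) AS authors
     , sum(size) AS size
     , count(*) FILTER (WHERE file_ext = 'python')   AS python
     , count(*) FILTER (WHERE file_ext = 'cpp')      AS cpp
     , count(*) FILTER (WHERE file_ext = 'non-code') AS "non-code"
FROM   tablename
GROUP  BY id
ORDER  BY id;
```

Replacing `string_agg(...)` with `min(author)` would instead keep a single author per `id`, similar in spirit to what the `crosstab()` variant does.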