跨多个未知数量列的每一行的 PostgreSQL 聚合函数
PostgreSQL aggregate function for each row across multiple unknown number of columns
我查看了类似 this one 的问题,但它们似乎有 确定的 列。我想输入一个不知道列数的table
问题:
如果列数事先未知,如何计算跨多列的每一行的聚合函数(例如avg()
或sum()
)?
我已经将输入 table panel_stats_rnd
csv 和一个 DLL 用于创建它 here。
我想为每一行计算 rnd_avg_parcelcount
作为所有列 c_1_avg_parcelcount
、c_2_avg_parcelcount
、...的平均值,我可以在其中输入 tables _avg_parcelcount
的任意数量(比如 100)列。对于 rnd_sum_parcelcount
列,我想计算以 c_
开头并以 _sum_parcelcount
.
结尾的所有列的 sum()
table 看起来像这样:
SELECT * FROM panel_stats_rnd;
gid | d | dist_from | dist_to | distlabel | rnd_avg_parcelcount | rnd_sum_parcelcount | rnd_avg_callcount | rnd_sum_callcount | rnd_avg_perccalled | called_avg_parcelcount | called_sum_parcelcount | called_avg_callcount | called_sum_callcount | called_avg_perccalled | c_1_avg_parcelcount | c_1_sum_parcelcount | c_1_avg_callcount | c_1_sum_callcount | c_1_avg_perccalled | c_2_avg_parcelcount | c_2_sum_parcelcount | c_2_avg_callcount | c_2_sum_callcount | c_2_avg_perccalled
-----+----+-----------+---------+-----------+---------------------+---------------------+-------------------+-------------------+--------------------+------------------------+------------------------+----------------------+----------------------+-----------------------+---------------------+---------------------+-------------------+-------------------+----------------------+---------------------+---------------------+-------------------+-------------------+----------------------
1 | 0 | 0 | 100 | 0-100 | | | | | | 119045 | 119045 | 119045 | 23 | 0.000193204250493511 | 119045 | 119045 | 119045 | 16 | 0.000134402956865051 | 119045 | 119045 | 119045 | 16 | 0.000134402956865051
2 | 1 | 100 | 200 | 100-200 | | | | | | 163140 | 163140 | 163140 | 22 | 0.000134853500061297 | 163140 | 163140 | 163140 | 17 | 0.000104204977320093 | 163140 | 163140 | 163140 | 18 | 0.000110334681868334
3 | 2 | 200 | 300 | 200-300 | | | | | | 135934 | 135934 | 135934 | 10 | 7.3565112481057e-05 | 135934 | 135934 | 135934 | 18 | 0.000132417202465903 | 135934 | 135934 | 135934 | 15 | 0.000110347668721585
4 | 3 | 300 | 400 | 300-400 | | | | | | 116874 | 116874 | 116874 | 13 | 0.000111230898232284 | 116874 | 116874 | 116874 | 11 | 9.41184523503944e-05 | 116874 | 116874 | 116874 | 18 | 0.000154012012937009
5 | 4 | 400 | 500 | 400-500 | | | | | | 93216 | 93216 | 93216 | 12 | 0.000128733264675592 | 93216 | 93216 | 93216 | 10 | 0.000107277720562993 | 93216 | 93216 | 93216 | 12 | 0.000128733264675592
6 | 5 | 500 | 600 | 500-600 | | | | | | 69992 | 69992 | 69992 | 7 | 0.0001000114298777 | 69992 | 69992 | 69992 | 10 | 0.000142873471253858 | 69992 | 69992 | 69992 | 7 | 0.0001000114298777
7 | 6 | 600 | 700 | 600-700 | | | | | | 50816 | 50816 | 50816 | 10 | 0.000196788413098237 | 50816 | 50816 | 50816 | 6 | 0.000118073047858942 | 50816 | 50816 | 50816 | 0 | 0
8 | 7 | 700 | 800 | 700-800 | | | | | | 34814 | 34814 | 34814 | 0 | 0 | 34814 | 34814 | 34814 | 6 | 0.000172344459125639 | 34814 | 34814 | 34814 | 4 | 0.000114896306083759
9 | 8 | 800 | 900 | 800-900 | | | | | | 23023 | 23023 | 23023 | 1 | 4.34348260435217e-05 | 23023 | 23023 | 23023 | 4 | 0.000173739304174087 | 23023 | 23023 | 23023 | 1 | 4.34348260435217e-05
10 | 9 | 900 | 1000 | 900-1000 | | | | | | 14215 | 14215 | 14215 | 1 | 7.03482237073514e-05 | 14215 | 14215 | 14215 | 1 | 7.03482237073514e-05 | 14215 | 14215 | 14215 | 5 | 0.000351741118536757
11 | 10 | 1000 | 5000 | 1000-5000 | | | | | | 23527 | 23527 | 23527 | 0 | 0 | 23527 | 23527 | 23527 | 0 | 0 | 23527 | 23527 | 23527 | 3 | 0.000127513070089684
(11 rows)
我为 2 列尝试了以下操作(可行,但我不想为 100 列写 5 次,此外列数必须是一个参数):
SELECT d,c_1_avg_parcelcount,c_2_avg_parcelcount,
(SELECT avg(c) FROM (VALUES (c_1_avg_parcelcount) , (c_2_avg_parcelcount) ) T (c)) AS Avg_,
(SELECT sum(c) FROM (VALUES (c_1_avg_parcelcount) , (c_2_avg_parcelcount) ) T (c)) AS sum_
FROM panel_stats_rnd;
我也尝试了以下但不起作用。
WITH cols AS (
select value(column_name) from information_schema.columns
where table_name = 'panel_stats_rnd'
AND column_name SIMILAR TO 'c_%avg_parcelcount'
AND column_name != 'called_avg_parcelcount'
)
SELECT *, (SELECT avg(Col) FROM cols V(Col) ) AS col_average
FROM panel_stats_rnd;
我快到了,但缺少一些东西...
select
*,
(select avg(v::numeric)
from json_each_text(row_to_json(panel_stats_rnd.*)) as j(k,v)
where k like 'c\_%\_avg\_parcelcount') as rnd_avg_parcelcount,
(select sum(v::numeric)
from json_each_text(row_to_json(panel_stats_rnd.*)) as j(k,v)
where k like 'c\_%\_sum\_parcelcount') as rnd_sum_parcelcount
from
panel_stats_rnd;
查看documentation涉及的函数
底层字符 (\_
) 有转义,因为对于 like
运算符,它意味着 任何 单个字符,例如 select 'a' like '_';
是 true
.
我查看了类似 this one 的问题,但它们似乎有 确定的 列。我想输入一个不知道列数的table
问题:
如果列数事先未知,如何计算跨多列的每一行的聚合函数(例如avg()
或sum()
)?
我已经将输入 table panel_stats_rnd
csv 和一个 DLL 用于创建它 here。
我想为每一行计算 rnd_avg_parcelcount
作为所有列 c_1_avg_parcelcount
、c_2_avg_parcelcount
、...的平均值,我可以在其中输入 tables _avg_parcelcount
的任意数量(比如 100)列。对于 rnd_sum_parcelcount
列,我想计算以 c_
开头并以 _sum_parcelcount
.
sum()
table 看起来像这样:
SELECT * FROM panel_stats_rnd;
gid | d | dist_from | dist_to | distlabel | rnd_avg_parcelcount | rnd_sum_parcelcount | rnd_avg_callcount | rnd_sum_callcount | rnd_avg_perccalled | called_avg_parcelcount | called_sum_parcelcount | called_avg_callcount | called_sum_callcount | called_avg_perccalled | c_1_avg_parcelcount | c_1_sum_parcelcount | c_1_avg_callcount | c_1_sum_callcount | c_1_avg_perccalled | c_2_avg_parcelcount | c_2_sum_parcelcount | c_2_avg_callcount | c_2_sum_callcount | c_2_avg_perccalled
-----+----+-----------+---------+-----------+---------------------+---------------------+-------------------+-------------------+--------------------+------------------------+------------------------+----------------------+----------------------+-----------------------+---------------------+---------------------+-------------------+-------------------+----------------------+---------------------+---------------------+-------------------+-------------------+----------------------
1 | 0 | 0 | 100 | 0-100 | | | | | | 119045 | 119045 | 119045 | 23 | 0.000193204250493511 | 119045 | 119045 | 119045 | 16 | 0.000134402956865051 | 119045 | 119045 | 119045 | 16 | 0.000134402956865051
2 | 1 | 100 | 200 | 100-200 | | | | | | 163140 | 163140 | 163140 | 22 | 0.000134853500061297 | 163140 | 163140 | 163140 | 17 | 0.000104204977320093 | 163140 | 163140 | 163140 | 18 | 0.000110334681868334
3 | 2 | 200 | 300 | 200-300 | | | | | | 135934 | 135934 | 135934 | 10 | 7.3565112481057e-05 | 135934 | 135934 | 135934 | 18 | 0.000132417202465903 | 135934 | 135934 | 135934 | 15 | 0.000110347668721585
4 | 3 | 300 | 400 | 300-400 | | | | | | 116874 | 116874 | 116874 | 13 | 0.000111230898232284 | 116874 | 116874 | 116874 | 11 | 9.41184523503944e-05 | 116874 | 116874 | 116874 | 18 | 0.000154012012937009
5 | 4 | 400 | 500 | 400-500 | | | | | | 93216 | 93216 | 93216 | 12 | 0.000128733264675592 | 93216 | 93216 | 93216 | 10 | 0.000107277720562993 | 93216 | 93216 | 93216 | 12 | 0.000128733264675592
6 | 5 | 500 | 600 | 500-600 | | | | | | 69992 | 69992 | 69992 | 7 | 0.0001000114298777 | 69992 | 69992 | 69992 | 10 | 0.000142873471253858 | 69992 | 69992 | 69992 | 7 | 0.0001000114298777
7 | 6 | 600 | 700 | 600-700 | | | | | | 50816 | 50816 | 50816 | 10 | 0.000196788413098237 | 50816 | 50816 | 50816 | 6 | 0.000118073047858942 | 50816 | 50816 | 50816 | 0 | 0
8 | 7 | 700 | 800 | 700-800 | | | | | | 34814 | 34814 | 34814 | 0 | 0 | 34814 | 34814 | 34814 | 6 | 0.000172344459125639 | 34814 | 34814 | 34814 | 4 | 0.000114896306083759
9 | 8 | 800 | 900 | 800-900 | | | | | | 23023 | 23023 | 23023 | 1 | 4.34348260435217e-05 | 23023 | 23023 | 23023 | 4 | 0.000173739304174087 | 23023 | 23023 | 23023 | 1 | 4.34348260435217e-05
10 | 9 | 900 | 1000 | 900-1000 | | | | | | 14215 | 14215 | 14215 | 1 | 7.03482237073514e-05 | 14215 | 14215 | 14215 | 1 | 7.03482237073514e-05 | 14215 | 14215 | 14215 | 5 | 0.000351741118536757
11 | 10 | 1000 | 5000 | 1000-5000 | | | | | | 23527 | 23527 | 23527 | 0 | 0 | 23527 | 23527 | 23527 | 0 | 0 | 23527 | 23527 | 23527 | 3 | 0.000127513070089684
(11 rows)
我为 2 列尝试了以下操作(可行,但我不想为 100 列写 5 次,此外列数必须是一个参数):
SELECT d,c_1_avg_parcelcount,c_2_avg_parcelcount,
(SELECT avg(c) FROM (VALUES (c_1_avg_parcelcount) , (c_2_avg_parcelcount) ) T (c)) AS Avg_,
(SELECT sum(c) FROM (VALUES (c_1_avg_parcelcount) , (c_2_avg_parcelcount) ) T (c)) AS sum_
FROM panel_stats_rnd;
我也尝试了以下但不起作用。
WITH cols AS (
select value(column_name) from information_schema.columns
where table_name = 'panel_stats_rnd'
AND column_name SIMILAR TO 'c_%avg_parcelcount'
AND column_name != 'called_avg_parcelcount'
)
SELECT *, (SELECT avg(Col) FROM cols V(Col) ) AS col_average
FROM panel_stats_rnd;
我快到了,但缺少一些东西...
select
*,
(select avg(v::numeric)
from json_each_text(row_to_json(panel_stats_rnd.*)) as j(k,v)
where k like 'c\_%\_avg\_parcelcount') as rnd_avg_parcelcount,
(select sum(v::numeric)
from json_each_text(row_to_json(panel_stats_rnd.*)) as j(k,v)
where k like 'c\_%\_sum\_parcelcount') as rnd_sum_parcelcount
from
panel_stats_rnd;
查看documentation涉及的函数
底层字符 (\_
) 有转义,因为对于 like
运算符,它意味着 任何 单个字符,例如 select 'a' like '_';
是 true
.