How can I show the counts of each group in my query as percentages?
I have a table (Movies) with one column, Actors. I'm trying to get the percentage of rows in which each actor appears.
CREATE TABLE movies AS SELECT * FROM ( VALUES
('Robert DeSouza'),
('Tony Wagner'),
('Sean Cortese'),
('Robert DeSouza'),
('Robert DeSouza'),
('Tony Wagner'),
('Sean Cortese'),
('Charles Bastian'),
('Robert DeSouza')
) AS t(actors);
The result I'm after:
select Actors, (some formula * 100) as "The Ratio" from Movies
Actors            The Ratio
Robert DeSouza    44%
Tony Wagner       22%
Sean Cortese      22%
Charles Bastian   11%
                  100%
You can do this with window functions. For the numeric calculation:
select m.actors,
count(*) * 1.0 / sum(count(*)) over () as ratio
from movies m
group by m.actors;
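You can convert the ratio to whatever format you want: multiply by 100 to get a percentage, and use string concatenation to append the percent sign. To me, something called a ratio should be between 0 and 1 (in this case).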
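As a rough sketch of that conversion, assuming PostgreSQL and the movies table from the question, the query below keeps the 0-to-1 ratio and adds a text column with the percent sign appended. The alias pct_label is an illustrative name, not something from the original answer.

```sql
-- Sketch only: ratio stays numeric (0..1); pct_label is a text rendering.
-- pct_label is a hypothetical alias chosen for this example.
SELECT m.actors,
       count(*) * 1.0 / sum(count(*)) OVER () AS ratio,
       -- scale to a percentage, round to a whole number, then append '%'
       round(count(*) * 100.0 / sum(count(*)) OVER ())::text || '%' AS pct_label
FROM movies m
GROUP BY m.actors
ORDER BY count(*) DESC, m.actors;
```

Keeping the numeric ratio alongside the label means you can still sort or filter on the number; the text column is for display only.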
SELECT actors, floor(count(*) *100 / sum(count(*)) OVER ())
FROM movies
GROUP BY actors
ORDER BY count(*) DESC;
This gets you most of the way there:
actors | floor
-----------------+-------
Robert DeSouza | 44
Tony Wagner | 22
Sean Cortese | 22
Charles Bastian | 11
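I'm not sure how the 100 in your example is supposed to work. You want the floor of each percentage, and you want the total to magically show 100? 44+22+22+11 is 99, so unless it's just one of those days, the total won't come out to 100 on its own. But we can do that too, by appending a hard-coded total row: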
SELECT actors AS "Actors", r::text || '%' AS "The Ratio"
FROM (
SELECT
actors AS "Actors",
floor(count(*) *100 / sum(count(*)) OVER ()) AS r,
false AS is_total
FROM movies
GROUP BY actors
UNION ALL
SELECT *
FROM ( VALUES
(null, 100, true)
) AS t(actors, floor, is_total)
ORDER BY 3, 2 DESC
) AS t(actors,r);
Output:
Actors | The Ratio
-----------------+-----------
Robert DeSouza | 44%
Tony Wagner | 22%
Sean Cortese | 22%
Charles Bastian | 11%
| 100%
If you don't want floor(), you can use round() instead.
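To compare the two, here is a minimal sketch (again assuming the movies table above) that shows both side by side. The aliases floored and rounded are illustrative.

```sql
-- Sketch: floor() always truncates toward the lower integer,
-- round() goes to the nearest integer.
SELECT actors,
       floor(count(*) * 100 / sum(count(*)) OVER ()) AS floored,
       round(count(*) * 100 / sum(count(*)) OVER ()) AS rounded
FROM movies
GROUP BY actors
ORDER BY count(*) DESC;
```

With the sample data the two columns agree, because every fractional part is below .5. They diverge for something like 2 rows out of 3 (66.67), where floor() gives 66 and round() gives 67.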
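There is no numeric type that includes a percent sign (the % character), so your problem can't be solved solely by an expression that calculates the numeric value. In addition to calculating that value, you need to format it as text using the to_char() function.
This function takes a numeric value and converts it to a text value using the formatting literal you supply as the second argument. In this case, it looks like you want to round to the nearest percent and display a percent sign. You probably want to use '990%' as the formatting literal. Adding this to your example table and query yields: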
[local] air@postgres=> CREATE TABLE movies AS SELECT * FROM ( VALUES
... ('Robert DeSouza'),
... ('Tony Wagner'),
... ('Sean Cortese'),
... ('Robert DeSouza'),
... ('Robert DeSouza'),
... ('Tony Wagner'),
... ('Sean Cortese'),
... ('Charles Bastian'),
... ('Robert DeSouza')
... ) AS t(actors);
SELECT 9
Time: 715.613 ms
[local] air@postgres=> select actors, to_char(100 * count(*) / sum(count(*)) over (), '990%') as "The Ratio" from movies group by actors;
┌─────────────────┬───────────┐
│ actors │ The Ratio │
├─────────────────┼───────────┤
│ Charles Bastian │ 11% │
│ Tony Wagner │ 22% │
│ Sean Cortese │ 22% │
│ Robert DeSouza │ 44% │
└─────────────────┴───────────┘
(4 rows)
Time: 31.501 ms
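Make sure you account for the need to display all possible values, including 100% and 0%. Because to_char() rounds to fit the precision you asked for, an actor's ratio may show as zero even though that actor is present in the table: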
[local] air@postgres=> delete from movies where actors <> 'Tony Wagner';
DELETE 7
Time: 36.697 ms
[local] ahuth@postgres=> insert into movies (actors) select 'Not Tony Wagner' from generate_series(1,500);
INSERT 0 500
Time: 149.022 ms
[local] ahuth@postgres=> select actors, to_char(100 * count(*) / sum(count(*)) over (), '990%') as "The Ratio" from movies group by actors;
┌─────────────────┬───────────┐
│ actors │ The Ratio │
├─────────────────┼───────────┤
│ Tony Wagner │ 0% │
│ Not Tony Wagner │ 100% │
└─────────────────┴───────────┘
(2 rows)
Time: 0.776 ms
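If you want to extend this to show decimal places, just modify the format string. Use 0 in the formatting literal wherever you want to force a leading or trailing zero.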
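For instance, a one-decimal variant might look like the sketch below. The '990.0%' literal and the FM prefix (which suppresses the padding to_char() normally adds) are my choices for illustration, not part of the original answer.

```sql
-- Sketch: one decimal place; the 0 after the point forces a trailing digit.
-- 'FM' strips the leading blanks to_char() would otherwise pad with.
SELECT actors,
       to_char(100 * count(*) / sum(count(*)) OVER (), 'FM990.0%') AS "The Ratio"
FROM movies
GROUP BY actors
ORDER BY count(*) DESC;
```

With one decimal place, an actor holding 2 of 502 rows (about 0.4 percent) no longer collapses to zero, though sufficiently rare values still will at any fixed precision.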
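This works by using a union to combine the grouped results with the overall total.
Edit: removed the concatenation and the 0 following @Air's comment.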
select actors actor, ratio from (
select 0 sort
, actors
, round(count(*) * 100.0 / ( select count(*)
from movies ),0) ratio
from movies
group by actors
union
select 1 sort
, 'Total'
, round(count(*) / count(*) * 100,0) ratio
from movies
) actors
order by sort, actors ;
-- Result
actor | ratio
-----------------+-------
Charles Bastian | 11
Robert DeSouza | 44
Sean Cortese | 22
Tony Wagner | 22
Total | 100
(5 rows)