How can I show the counts of each group in my query as percentages?

I have a column, Actors, in my table (Movies). I'm trying to get the percentage of rows in which each actor appears.

CREATE TABLE movies AS SELECT * FROM ( VALUES
  ('Robert DeSouza'),
  ('Tony Wagner'),
  ('Sean Cortese'),
  ('Robert DeSouza'),
  ('Robert DeSouza'),
  ('Tony Wagner'),
  ('Sean Cortese'),
  ('Charles Bastian'),
  ('Robert DeSouza')
) AS t(actors);

The result I'm after:

select Actors, (some formula * 100) as "The Ratio" from Movies

Actors                       The Ratio
Robert DeSouza                 44%
Tony Wagner                    22%
Sean Cortese                   22%
Charles Bastian                11%
                               100%

You can do this with window functions. For the numeric calculation:

select m.actors,
       count(*) * 1.0 / sum(count(*)) over () as ratio
from movies m
group by m.actors;

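You can convert the ratio to whatever format you like: multiply by 100 to get a percentage, and use string concatenation to add the percent sign. To me, something called a ratio should be between 0 and 1 (in this case).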
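As a rough sketch of that conversion (assuming PostgreSQL and the movies table from the question; using round() here is my assumption, not part of the answer above), the ratio can be scaled to a percentage and the sign appended with ||:

```sql
-- Sketch only: scale the 0..1 ratio to a percentage and append '%'.
-- round() is an assumption; swap in floor() or trunc() if you prefer.
SELECT actors,
       round(count(*) * 100.0 / sum(count(*)) OVER ())::text || '%' AS "The Ratio"
FROM movies
GROUP BY actors
ORDER BY count(*) DESC;
```

The formatted column is text, so the query sorts on count(*) rather than on "The Ratio".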

SELECT actors, floor(count(*) *100 / sum(count(*)) OVER ())
FROM movies
GROUP BY actors
ORDER BY count(*) DESC;

This gets you most of the way there.

     actors      | floor 
-----------------+-------
 Robert DeSouza  |    44
 Tony Wagner     |    22
 Sean Cortese    |    22
 Charles Bastian |    11

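I'm not sure how you arrived at the 100 in your example. You want the floor of each percentage, and you want the total to magically show 100? If 44+22+22+11 = 100 today, then it's just one of those days. But we can do that too.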

SELECT actors AS "Actors", r::text || '%' AS "The Ratio"
FROM (
  SELECT
    actors AS "Actors",
    floor(count(*) *100 / sum(count(*)) OVER ()) AS r,
    false AS is_total
  FROM movies
  GROUP BY actors
  UNION ALL
    SELECT *
    FROM ( VALUES
      (null, 100, true)
    ) AS t(actors, floor, is_total)
  ORDER BY 3, 2 DESC
) AS t(actors,r);

Output:

     Actors      | The Ratio 
-----------------+-----------
 Robert DeSouza  | 44%
 Tony Wagner     | 22%
 Sean Cortese    | 22%
 Charles Bastian | 11%
                 | 100%
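To see why the total row above has to be hard-coded, here is a small check (a sketch against the same movies table; the aliases pct and floored_total are made up for illustration) that adds up the floored percentages:

```sql
-- Sketch: sum the floored per-actor percentages.
-- With the sample data this is 44 + 22 + 22 + 11, i.e. 99 rather than 100.
SELECT sum(pct) AS floored_total
FROM (
  SELECT floor(count(*) * 100 / sum(count(*)) OVER ()) AS pct
  FROM movies
  GROUP BY actors
) AS t;
```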

If you don't want floor(), you can use round() instead.
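A side-by-side sketch of the two, assuming the same movies table (the column aliases are illustrative only). The two-argument form of round() keeps decimal places, which floor() cannot do:

```sql
-- Sketch: compare floor() with round() on the same percentage.
SELECT actors,
       floor(count(*) * 100 / sum(count(*)) OVER ())    AS floored,
       round(count(*) * 100 / sum(count(*)) OVER ())    AS rounded,
       round(count(*) * 100 / sum(count(*)) OVER (), 1) AS rounded_1dp
FROM movies
GROUP BY actors
ORDER BY count(*) DESC;
```

For the sample data floored and rounded agree, because every fractional part is below .5; they diverge once a percentage such as 66.7 shows up.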

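There is no numeric type that includes a percent sign (the % character), so your problem can't be solved solely by an expression that calculates the numeric value. In addition to calculating that value, you need to format it as text using the to_char() function.

This function takes a numeric value and converts it to a text value, using the format literal you supply as the second argument. In this case, it looks like what you want is to round to the nearest percent and display a percent sign. You probably want to use '990%' as the format literal. Adding this to your example table and query yields: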

[local] air@postgres=> CREATE TABLE movies AS SELECT * FROM ( VALUES
...   ('Robert DeSouza'),
...   ('Tony Wagner'),
...   ('Sean Cortese'),
...   ('Robert DeSouza'),
...   ('Robert DeSouza'),
...   ('Tony Wagner'),
...   ('Sean Cortese'),
...   ('Charles Bastian'),
...   ('Robert DeSouza')
... ) AS t(actors);

SELECT 9
Time: 715.613 ms
[local] air@postgres=> select actors, to_char(100 * count(*) / sum(count(*)) over (), '990%') as "The Ratio" from movies group by actors;
┌─────────────────┬───────────┐
│     actors      │ The Ratio │
├─────────────────┼───────────┤
│ Charles Bastian │   11%     │
│ Tony Wagner     │   22%     │
│ Sean Cortese    │   22%     │
│ Robert DeSouza  │   44%     │
└─────────────────┴───────────┘

(4 rows)

Time: 31.501 ms

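You'll want to make sure you account for the need to display every possible value, including 100% and 0%. Because to_char() rounds to fit the precision you asked for, an actor's ratio can still display as zero even though the actor is present in the table:
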
[local] air@postgres=> delete from movies where actors <> 'Tony Wagner';
DELETE 7
Time: 36.697 ms
[local] ahuth@postgres=> insert into movies (actors) select 'Not Tony Wagner' from generate_series(1,500);
INSERT 0 500
Time: 149.022 ms
[local] ahuth@postgres=> select actors, to_char(100 * count(*) / sum(count(*)) over (), '990%') as "The Ratio" from movies group by actors;
┌─────────────────┬───────────┐
│     actors      │ The Ratio │
├─────────────────┼───────────┤
│ Tony Wagner     │    0%     │
│ Not Tony Wagner │  100%     │
└─────────────────┴───────────┘
(2 rows)

Time: 0.776 ms

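If you want to extend this to show decimal places, just modify the format string. Use 0 in the format literal wherever you want to force a leading or trailing zero.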
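For instance, a sketch with one decimal place, assuming the same movies table. Appending the percent sign with || and using the FM prefix are my choices here rather than part of the answer above:

```sql
-- Sketch: one decimal place. '0' forces a digit in that position, '9' does not;
-- the FM prefix strips the padding that to_char() would otherwise add.
SELECT actors,
       to_char(100 * count(*) / sum(count(*)) OVER (), 'FM990.0') || '%' AS "The Ratio"
FROM movies
GROUP BY actors;
```

With this format an actor at 2 out of 502 rows would no longer be displayed as zero, since 0.398 rounds to 0.4 at one decimal place.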

This works by using a union to combine the grouped results with the overall result.

Edit: removed the concatenation and the 0 after @Air's comment.

select actors actor, ratio from ( 
    select 0 sort
    , actors
    , round(count(*) * 100.0 / ( select count(*) 
    from movies ),0) ratio 
    from movies 
    group by actors 
    union 
    select 1 sort
    , 'Total'
    , round(count(*) / count(*) * 100,0) ratio 
    from movies 
) actors 
order by sort, actors ; 

-- Result

      actor      | ratio 
-----------------+-------
 Charles Bastian | 11
 Robert DeSouza  | 44
 Sean Cortese    | 22
 Tony Wagner     | 22
 Total           | 100
(5 rows)
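A different technique that gives the same shape of result is GROUP BY ROLLUP, which is not what the query above uses. This is only a sketch, assuming PostgreSQL 9.5 or later, where ROLLUP and grouping() are available:

```sql
-- Sketch: ROLLUP adds one extra row aggregated over all actors.
-- grouping(actors) is 1 on that extra row and 0 on the per-actor rows.
SELECT CASE WHEN grouping(actors) = 1 THEN 'Total' ELSE actors END AS actor,
       round(count(*) * 100.0 / (SELECT count(*) FROM movies)) AS ratio
FROM movies
GROUP BY ROLLUP (actors)
ORDER BY grouping(actors), actors;
```

The total here is computed from the full row count rather than by adding up the rounded per-actor values, so it shows 100 even though the displayed percentages sum to 99.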