Calculate percentages of count in SQL relative to the category

I'm working in MySQL. Initially I had a table that looked like this:

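My task is to calculate the percentage of cancelled bookings among people who spent a long time on the waiting list and among those who spent a short time on it. After some manipulation I came up with the following code: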

SELECT
    CASE WHEN days_in_waiting_list > (SELECT AVG(days_in_waiting_list) FROM Bookings) THEN 'Long wait'
        ELSE 'Short wait'
    END AS waiting,
    is_canceled,
    COUNT(*),
    COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () AS perc_cancelled
FROM Bookings
GROUP BY waiting, is_canceled;

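The resulting table:

But I want the percentages calculated within each category rather than over the whole table, so that the percentages for Short wait sum to 100, and likewise for Long wait. I'd like it to look like this: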

waiting     is_canceled  perc
Short wait  0            0.61
Short wait  1            0.39
Long wait   0            0.32
Long wait   1            0.68

Is there a way to do this? I know I could use over(partition by waiting), but that gives me an error:

Error Code: 1054. Unknown column 'waiting' in 'window partition by'

You want the percentage computed per category, so perc_cancelled needs to be partitioned by the waiting category:

count(*) * 100.0 / sum(count(*)) over(partition by waiting) AS perc_cancelled

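WITH labelled AS (
    -- Label every booking first, so that waiting is a real column
    -- (MySQL raised Error 1054 when the window referred to the SELECT alias).
    SELECT
        CASE WHEN days_in_waiting_list > (SELECT AVG(days_in_waiting_list) FROM Bookings)
            THEN 'Long wait'
            ELSE 'Short wait'
        END AS waiting,
        is_canceled
    FROM Bookings
),
cte AS (
    SELECT waiting,
        is_canceled,
        COUNT(*) AS subTotal,
        -- No ORDER BY in the window: it would turn the sum into a running total.
        SUM(COUNT(*)) OVER (PARTITION BY waiting) AS totalSum
    FROM labelled
    GROUP BY waiting, is_canceled
)
SELECT waiting,
    is_canceled,
    subTotal * 100.0 / totalSum AS percentage
FROM cte;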
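
To sanity-check the per-category window sum, here is a minimal sketch on a handful of made-up rows. It is an illustration only: Python's built-in sqlite3 module (SQLite 3.25 or later) stands in for MySQL, the ten rows are invented, and the CTE names labelled and counted are hypothetical. The label is computed before grouping, so the window partitions by a real column rather than by a SELECT alias.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (days_in_waiting_list INTEGER, is_canceled INTEGER)"
)
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [(0, 0), (0, 0), (0, 0), (0, 1), (5, 0), (5, 1),
     (90, 1), (90, 1), (90, 1), (90, 0)],
)

query = """
WITH labelled AS (          -- step 1: attach the category to every row
    SELECT CASE WHEN days_in_waiting_list >
                     (SELECT AVG(days_in_waiting_list) FROM bookings)
                THEN 'Long wait' ELSE 'Short wait' END AS waiting,
           is_canceled
    FROM bookings
),
counted AS (                -- step 2: count per (category, is_canceled)
    SELECT waiting, is_canceled, COUNT(*) AS n
    FROM labelled
    GROUP BY waiting, is_canceled
)
SELECT waiting,             -- step 3: divide by the category total
       is_canceled,
       n * 100.0 / SUM(n) OVER (PARTITION BY waiting) AS perc
FROM counted
ORDER BY waiting, is_canceled
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With these rows the average wait is 37 days, so the four 90-day bookings form the Long wait group (25% kept, 75% cancelled) and the remaining six form the Short wait group. Within each group the percentages add up to 100.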

I would use a window function to compute the average, then aggregate:

select waiting, is_canceled,
    count(*) / sum(count(*)) over(partition by waiting) as ratio
from (
    select b.*,
        case when days_in_waiting_list > avg(days_in_waiting_list) over()
            then 'Long wait'
            else 'Short wait'
        end as waiting
    from Bookings b
) b
group by waiting, is_canceled
order by waiting, is_canceled
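
The labelling step in that derived table can be looked at on its own. The sketch below is only an illustration under stated assumptions: Python's sqlite3 module (SQLite 3.25 or later) replaces MySQL, the four rows are invented, and the overall_avg column is added purely for display. It shows that AVG(...) OVER () attaches the same table-wide average to every row without collapsing the rows, which is what lets the CASE expression compare each booking against it.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (days_in_waiting_list INTEGER, is_canceled INTEGER)"
)
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [(0, 0), (10, 1), (20, 0), (90, 1)],
)

labelled = conn.execute(
    """
    SELECT days_in_waiting_list,
           AVG(days_in_waiting_list) OVER () AS overall_avg,  -- same value on every row
           CASE WHEN days_in_waiting_list > AVG(days_in_waiting_list) OVER ()
                THEN 'Long wait' ELSE 'Short wait'
           END AS waiting
    FROM bookings
    ORDER BY days_in_waiting_list
    """
).fetchall()

for row in labelled:
    print(row)
```

Because the label is produced inside the derived table, the outer query sees waiting as an ordinary column and can use it in both GROUP BY and PARTITION BY.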

Create a table with only the relevant columns and 40 random rows:

CREATE TABLE test (
  id INT AUTO_INCREMENT PRIMARY KEY,
  -- MySQL (8.0.13+) requires expression defaults to be wrapped in parentheses
  is_canceled INT DEFAULT (FLOOR(RAND() * 2)),
  days_in_waiting_list INT DEFAULT (FLOOR(RAND() * 100))
);
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();

You're really close; just add PARTITION BY waiting:

SELECT
  CASE WHEN days_in_waiting_list > (
    SELECT AVG(days_in_waiting_list)
    FROM test
  ) THEN 'Long' ELSE 'Short' END AS waiting,
  is_canceled,
  COUNT(*),
  COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY waiting)
FROM test
GROUP BY waiting, is_canceled;

waiting  is_canceled  COUNT(*)  COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY waiting)
Long     0            10        58.82353
Long     1            7         41.17647
Short    0            15        65.21739
Short    1            8         34.78261
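
The percentages in that result can be re-derived by hand from the COUNT(*) column. The short sketch below does the arithmetic in plain Python. Only the four counts are taken from the table; the variable names are made up for illustration.

```python
# Counts copied from the result table: (waiting, is_canceled) -> COUNT(*)
counts = {("Long", 0): 10, ("Long", 1): 7, ("Short", 0): 15, ("Short", 1): 8}

# Denominator per category, i.e. what SUM(COUNT(*)) OVER (PARTITION BY waiting) yields
totals = {}
for (waiting, _), n in counts.items():
    totals[waiting] = totals.get(waiting, 0) + n

# Each count divided by its own category total, as a percentage
percentages = {key: n * 100.0 / totals[key[0]] for key, n in counts.items()}

for (waiting, is_canceled), pct in percentages.items():
    print(waiting, is_canceled, round(pct, 5))
```

The Long rows are divided by 17 and the Short rows by 23, so each category sums to 100 on its own, which is the behaviour the question asked for.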