Calculate percentages of count in SQL relative to the category
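I am working in MySQL.
Initially I had a table that looks like this:
My task is to calculate the percentage of cancelled bookings among people who spent a long time on the waiting list and among those who waited a short time.
After some manipulation I came up with the following code: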
SELECT
CASE WHEN days_in_waiting_list > (SELECT AVG(days_in_waiting_list) FROM Bookings) THEN 'Long wait'
ELSE 'Short wait'
END AS waiting,
is_canceled, COUNT(*), count(*) * 100.0 / sum(count(*)) over() AS perc_cancelled
FROM Bookings
GROUP BY waiting, is_canceled;
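Resulting table:
But I want the percentages to be calculated within each category, not over the whole table, so that the percentages for Short wait sum to 100, and likewise for Long wait. I would like it to look like this: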
waiting | is_cancelled | perc |
---|---|---|
Short wait | 0 | 0.61 |
Short wait | 1 | 0.39 |
Long wait | 0 | 0.32 |
Long wait | 1 | 0.68 |
Is there a way to do this? I know I could use over(partition by waiting), but it gives me an error:
Error Code: 1054. Unknown column 'waiting' in 'window partition by'
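You want the percentage per category, so the window that computes perc_cancelled needs to be partitioned by the waiting category:
count(*) * 100.0 / sum(count(*)) over(partition by waiting) AS perc_cancelled
Because the window clause cannot see the waiting alias in the same SELECT, count the rows in a CTE first and apply the window in the outer query: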
WITH cte AS (
    -- Step 1: label each booking and count rows per (waiting, is_canceled)
    SELECT
        CASE WHEN days_in_waiting_list > (SELECT AVG(days_in_waiting_list) FROM Bookings)
             THEN 'Long wait'
             ELSE 'Short wait'
        END AS waiting,
        is_canceled,
        COUNT(*) AS subTotal
    FROM Bookings
    GROUP BY waiting, is_canceled
)
-- Step 2: waiting is now a real column of cte, so it can be used in PARTITION BY.
-- No ORDER BY inside the window: it would turn the sum into a running total.
SELECT waiting,
       is_canceled,
       subTotal * 100.0 / SUM(subTotal) OVER (PARTITION BY waiting) AS percentage
FROM cte;
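A minimal sketch of the CTE idea on a small fixed dataset, so the per-category percentages can be checked by hand. It makes several assumptions that are not part of the thread: it runs against an in-memory SQLite database through Python's `sqlite3` module rather than MySQL (window functions require SQLite 3.25 or newer), and the function name `category_percentages` and the sample rows `SAMPLE` are made up for illustration.

```python
import sqlite3

# Hypothetical sample: (is_canceled, days_in_waiting_list).
# The average wait is 50, so five rows are 'Short wait' and three are 'Long wait'.
SAMPLE = [(0, 0), (0, 0), (0, 0), (1, 10), (1, 20), (0, 100), (1, 120), (1, 150)]

QUERY = """
WITH cte AS (
    SELECT CASE WHEN days_in_waiting_list >
                     (SELECT AVG(days_in_waiting_list) FROM Bookings)
                THEN 'Long wait' ELSE 'Short wait' END AS waiting,
           is_canceled,
           COUNT(*) AS subTotal
    FROM Bookings
    GROUP BY waiting, is_canceled
)
SELECT waiting,
       is_canceled,
       subTotal * 100.0 / SUM(subTotal) OVER (PARTITION BY waiting) AS percentage
FROM cte
ORDER BY waiting, is_canceled
"""


def category_percentages(rows):
    """Load rows into an in-memory table and return (waiting, is_canceled, percentage)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Bookings (is_canceled INTEGER, days_in_waiting_list INTEGER)"
    )
    conn.executemany("INSERT INTO Bookings VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in category_percentages(SAMPLE):
        print(row)
```

Within each waiting category the percentages add up to 100, which is the behaviour the question asks for.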
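I would use a window function to compute the average, then aggregate. Since waiting is computed in the derived table, the outer query can use it in PARTITION BY:
select waiting, is_canceled,
    count(*) / sum(count(*)) over(partition by waiting) as ratio
from (
    select b.*,
        -- avg(...) over() is the average over the whole table, repeated on every row
        case when days_in_waiting_list > avg(days_in_waiting_list) over()
            then 'Long wait'
            else 'Short wait'
        end as waiting
    from bookings b
) b
group by waiting, is_canceled
order by waiting, is_canceled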
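As a rough check of the classification step, the sketch below labels each row twice, once with `AVG(...) OVER ()` and once with the scalar subquery from the question, and returns both labels side by side. It assumes an in-memory SQLite database (3.25 or newer) via Python's `sqlite3` instead of MySQL; the function name `label_both_ways` and the sample values are hypothetical.

```python
import sqlite3


def label_both_ways(days):
    """Return (id, label_from_window_avg, label_from_subquery_avg) for each row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookings (id INTEGER PRIMARY KEY, days_in_waiting_list INTEGER)"
    )
    conn.executemany(
        "INSERT INTO bookings (days_in_waiting_list) VALUES (?)",
        [(d,) for d in days],
    )
    rows = conn.execute(
        """
        SELECT id,
               CASE WHEN days_in_waiting_list > AVG(days_in_waiting_list) OVER ()
                    THEN 'Long wait' ELSE 'Short wait' END AS by_window,
               CASE WHEN days_in_waiting_list >
                         (SELECT AVG(days_in_waiting_list) FROM bookings)
                    THEN 'Long wait' ELSE 'Short wait' END AS by_subquery
        FROM bookings
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # Hypothetical waits with an average of 50 days.
    for row in label_both_ways([0, 0, 0, 10, 20, 100, 120, 150]):
        print(row)
```

Both columns carry the same label on every row, so choosing between the window average and the subquery is a matter of style for this query.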
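Create a table with only the relevant columns and 40 random rows:
CREATE TABLE test (
    id INT AUTO_INCREMENT PRIMARY KEY,
    -- expression defaults are wrapped in parentheses, as MySQL 8.0.13+ requires
    is_canceled INT DEFAULT (FLOOR(RAND() * 2)),
    days_in_waiting_list INT DEFAULT (FLOOR(RAND() * 100))
);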
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
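You were really close; just add PARTITION BY waiting. Note that the asker reports MySQL rejecting exactly this alias reference with error 1054; if your server does the same, compute waiting in a derived table or CTE first and partition by it in the outer query: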
SELECT
CASE WHEN days_in_waiting_list > (
SELECT AVG(days_in_waiting_list)
FROM test
) THEN 'Long' ELSE 'Short' END AS waiting,
is_canceled,
COUNT(*),
COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY waiting)
FROM test
GROUP BY waiting, is_canceled;
waiting | is_canceled | `count(*)` | `count(*) / sum(count(*)) over (partition by waiting)` |
---|---|---|---|
Long | 0 | 10 | 58.82353 |
Long | 1 | 7 | 41.17647 |
Short | 0 | 15 | 65.21739 |
Short | 1 | 8 | 34.78261 |
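The percentages in this result can be reproduced from the counts alone. The short Python sketch below divides each count by the total of its own waiting category; the counts come from the result above, while the function name `percentages` is made up for illustration.

```python
# Counts taken from the result above: (waiting, is_canceled) -> count(*)
COUNTS = {("Long", 0): 10, ("Long", 1): 7, ("Short", 0): 15, ("Short", 1): 8}


def percentages(counts):
    """Divide each count by the total of its waiting category, as a percentage."""
    totals = {}
    for (waiting, _), n in counts.items():
        totals[waiting] = totals.get(waiting, 0) + n
    return {
        key: round(n * 100.0 / totals[key[0]], 5)
        for key, n in counts.items()
    }


if __name__ == "__main__":
    for key, pct in percentages(COUNTS).items():
        print(key, pct)
```

Long has 17 rows in total and Short has 23, so for example 10 / 17 gives 58.82353 percent, and each category sums to 100.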