How to calculate the median in Postgres?
I have created a basic database (image attached) and I am trying to find the following:
"Median total amount spent per user in each calendar month"
I tried the following, but I get an error:
SELECT
    user_id,
    AVG(total_per_user)
FROM (SELECT user_id,
             ROW_NUMBER() over (ORDER BY total_per_user DESC) AS desc_total,
             ROW_NUMBER() over (ORDER BY total_per_user ASC) AS asc_total
      FROM (SELECT EXTRACT(MONTH FROM created_at) AS calendar_month,
                   user_id,
                   SUM(amount) AS total_per_user
            FROM transactions
            GROUP BY calendar_month, user_id) AS total_amount
      ORDER BY user_id) AS a
WHERE asc_total IN (desc_total, desc_total+1, desc_total-1)
GROUP BY user_id
;
In Postgres, you can simply use the aggregate function percentile_cont():
select
    user_id,
    percentile_cont(0.5) within group (order by total_per_user) as median_total_per_user
from (
    select user_id, sum(amount) as total_per_user
    from transactions
    group by date_trunc('month', created_at), user_id
) t
group by user_id
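Note that date_trunc() is probably closer to what you want than extract(month from ...), unless you really do want to add together the amounts for the same month in different years, which is not how I understand your requirement.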
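To make the difference concrete, here is a small sketch that runs on its own, without the transactions table. The two sample timestamps and the names sample and ts are made up for illustration only.

```sql
-- Two hypothetical timestamps in the same month of different years.
SELECT ts,
       extract(month FROM ts)  AS month_number,  -- the same value for both rows
       date_trunc('month', ts) AS month_start    -- a different value per year
FROM (VALUES (timestamp '2019-03-15 10:00:00'),
             (timestamp '2020-03-02 08:30:00')) AS sample(ts);
```

Grouping by month_number would put both rows into one bucket, while grouping by month_start keeps March 2019 and March 2020 apart.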
Just use percentile_cont(). I don't fully understand the question. If you want the median of each user's monthly spending, then:
SELECT user_id,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_per_user) AS median_total_per_user
FROM (SELECT DATE_TRUNC('month', created_at) AS calendar_month,
             user_id, SUM(amount) AS total_per_user
      FROM transactions t
      GROUP BY calendar_month, user_id
     ) um
GROUP BY user_id;
There is a built-in function for the median. Nothing more complicated is needed.
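Since the wording of the requirement is ambiguous, here is a sketch of the other possible reading: one median per calendar month, taken across the per-user totals for that month. It assumes the transactions table has the user_id, amount and created_at columns used in the question. The alias monthly_totals is just a name chosen here.

```sql
-- Inner query: total spent by each user in each calendar month.
-- Outer query: median of those per-user totals within each month.
SELECT calendar_month,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY total_per_user) AS median_total_per_user
FROM (
    SELECT date_trunc('month', created_at) AS calendar_month,
           user_id,
           sum(amount) AS total_per_user
    FROM transactions
    GROUP BY date_trunc('month', created_at), user_id
) AS monthly_totals
GROUP BY calendar_month
ORDER BY calendar_month;
```

The only change from the query above is the outer GROUP BY: grouping by calendar_month gives one row per month, while grouping by user_id gives one row per user.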