How to calculate the median in Postgres?

I have created a basic database (image attached), and I am trying to find the following:

"Median total amount spent per user in each calendar month"

I tried the following, but I get an error:

SELECT 
user_id,
AVG(total_per_user)
FROM (SELECT user_id,
        ROW_NUMBER() over (ORDER BY total_per_user DESC) AS desc_total,
        ROW_NUMBER() over (ORDER BY total_per_user ASC) AS asc_total
      FROM (SELECT EXTRACT(MONTH FROM created_at) AS calendar_month,
            user_id,    
            SUM(amount) AS total_per_user
            FROM transactions
            GROUP BY calendar_month, user_id) AS total_amount   
      ORDER BY user_id) AS a
WHERE asc_total IN (desc_total, desc_total+1, desc_total-1)
GROUP BY user_id
;

In Postgres, you can just use the aggregate function percentile_cont():

select 
    user_id,
    percentile_cont(0.5) within group(order by total_per_user) median_total_per_user
from (
    select user_id, sum(amount) total_per_user
    from transactions
    group by date_trunc('month', created_at), user_id
) t
group by user_id

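Note that date_trunc() is probably closer to what you want than extract(month from ...), unless you really do want to add together amounts from the same month in different years, which is not how I understood your requirement.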
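To see the difference concretely, here is a small sketch that runs both functions over two made-up timestamps one year apart. The inline values list and the column names month_number and month_start are assumptions for illustration, not part of your schema:

```sql
-- Two hypothetical timestamps in the same month of different years.
select created_at,
       extract(month from created_at)  as month_number,  -- 3 for both rows
       date_trunc('month', created_at) as month_start    -- 2019-03-01 and 2020-03-01
from (values (timestamp '2019-03-15 10:00:00'),
             (timestamp '2020-03-20 12:00:00')) as v(created_at);
```

Grouping by month_number would merge the two rows into a single bucket, while grouping by month_start keeps March 2019 and March 2020 apart.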

Just use percentile_cont(). I don't fully understand the question. If you want the median of each user's monthly spending, then:

SELECT user_id,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_per_user) AS median_total_per_user
FROM (SELECT DATE_TRUNC('month', created_at) AS calendar_month,
             user_id, SUM(amount) AS total_per_user
      FROM transactions t
      GROUP BY calendar_month, user_id
     ) um
GROUP BY user_id;

There is a built-in function for the median. No fancier processing is needed.
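If the requirement is instead read as "for each calendar month, the median across users of each user's total", the same function applies with the outer grouping switched to the month. The following is only a sketch under that assumption. The sample rows in the transactions CTE are made up so the query can be run on its own, and the monthly_totals name is hypothetical:

```sql
-- Hypothetical sample rows standing in for the real transactions table.
WITH transactions(user_id, amount, created_at) AS (
    VALUES
        (1, 10.0, TIMESTAMP '2020-01-05'),
        (1, 20.0, TIMESTAMP '2020-01-20'),
        (2, 50.0, TIMESTAMP '2020-01-11'),
        (3, 70.0, TIMESTAMP '2020-01-15'),
        (1, 40.0, TIMESTAMP '2020-02-02'),
        (2, 80.0, TIMESTAMP '2020-02-09')
),
monthly_totals AS (
    -- One row per user per calendar month.
    SELECT DATE_TRUNC('month', created_at) AS calendar_month,
           user_id,
           SUM(amount) AS total_per_user
    FROM transactions
    GROUP BY calendar_month, user_id
)
-- Median of the per-user totals within each month.
SELECT calendar_month,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_per_user) AS median_total_per_user
FROM monthly_totals
GROUP BY calendar_month
ORDER BY calendar_month;
```

With these sample rows, January has per-user totals of 30, 50 and 70, so the median is 50. February has only 40 and 80, and PERCENTILE_CONT interpolates between them to give 60.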