How to Calculate Median Price Per Unit Using PERCENTILE_CONT and GROUP BY id

I am using Postgres 9.5 and trying to calculate the median and average price per unit, grouped by id. Here is the query on DBFIDDLE.

Here is the data:

id   | price | units
-----+-------+--------
1    |  100  | 15
1    |  90   | 10
1    |  50   |  8
1    |  40   |  8
1    |  30   |  7
2    |  110  | 22
2    |  60   |  8
2    |  50   | 11

Here is my query, using percentile_cont:

SELECT id,
  ceil(avg(price)) as avg_price,
  percentile_cont(0.5) within group (order by price) as median_price,
  ceil( sum (price) / sum (units) ) AS avg_pp_unit,
  ceil( percentile_cont(0.5) within group (order by price)  / 
        percentile_cont(0.5) within group (order by units) ) as median_pp_unit
FROM t
GROUP by id

This query returns:

id| avg_price | median_price | avg_pp_unit  | median_pp_unit 
--+-----------+--------------+--------------+---------------
1 |   62      |     50       |      6       |      7 
2 |   74      |     60       |      5       |      5

I am fairly sure the average calculation is correct. Is this the right way to calculate the median price per unit?

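This post suggests that it is correct (although performance is poor), but I am curious whether the division in the median calculation could skew the result.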

Calculating median with PERCENTILE_CONT and grouping

The median is the value separating the higher half from the lower half of a data sample (a population or a probability distribution). For a data set, it may be thought of as the "middle" value. https://en.wikipedia.org/wiki/Median

So, taken across the whole table, your median price is 55 and your median units value is 9:

        Sort by price                  Sort by units
  id    |   price   |  units |  | id    |  price  |   units  
 -------|-----------|--------|  |-------|---------|---------- 
      1 | 30        |      7 |  |     1 |      30 | 7        
      1 | 40        |      8 |  |     1 |      40 | 8        
      1 | 50        |      8 |  |     1 |      50 | 8        
 >>>  2 | 50        |     11 |  |     2 |      60 | 8    <<<<    
 >>>  2 | 60        |      8 |  |     1 |      90 | 10   <<<<
      1 | 90        |     10 |  |     2 |      50 | 11       
      1 | 100       |     15 |  |     1 |     100 | 15       
      2 | 110       |     22 |  |     2 |     110 | 22       
        |           |        |  |       |         |          
         (50+60)/2                               (8+10)/2 
          55                                        9        

I am not sure what you mean by "median price per unit":

CREATE TABLE t(
   id    INTEGER  NOT NULL
  ,price INTEGER  NOT NULL
  ,units INTEGER  NOT NULL
);
INSERT INTO t(id,price,units) VALUES (1,30,7);
INSERT INTO t(id,price,units) VALUES (1,40,8);
INSERT INTO t(id,price,units) VALUES (1,50,8);
INSERT INTO t(id,price,units) VALUES (2,50,11);
INSERT INTO t(id,price,units) VALUES (2,60,8);
INSERT INTO t(id,price,units) VALUES (1,90,10);
INSERT INTO t(id,price,units) VALUES (1,100,15);
INSERT INTO t(id,price,units) VALUES (2,110,22);

SELECT
       percentile_cont(0.5) WITHIN GROUP (ORDER BY price) med_price
     , percentile_cont(0.5) WITHIN GROUP (ORDER BY units) med_units
FROM
  t;

     | med_price | med_units 
 ----|-----------|----------- 
   1 |        55 |         9 
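The query above takes the medians over the whole table. Since the question groups by id, here is a minimal sketch of the same calculation per id, assuming the table t created above:

```sql
-- Per-id medians, using the same table t as above.
SELECT id
     , percentile_cont(0.5) WITHIN GROUP (ORDER BY price) AS med_price
     , percentile_cont(0.5) WITHIN GROUP (ORDER BY units) AS med_units
FROM
  t
GROUP BY id
ORDER BY id;
```

With the sample data this gives 50 and 8 for id 1, and 60 and 11 for id 2, which matches the median_price column in the question.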

If the "price" column represents a unit price, then you do not need to divide 55 by 9; but if "price" is an order total, then you need to divide by units: 55/9 = 6.11
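On the question of whether the division skews the result: dividing one median by another is not the same as taking the median of each row's own price per unit. If "price" is an order total, one possible reading of "median price per unit" is the latter. The sketch below computes both per id so they can be compared. It assumes the table t above, and that units is never 0.

```sql
-- Sketch: median of the per-row unit price vs. ratio of the two medians.
-- Assumes price is an order total and units is never 0
-- (a zero would raise a division-by-zero error).
SELECT id
     , percentile_cont(0.5) WITHIN GROUP (
         ORDER BY price::double precision / units  -- cast avoids integer division
       ) AS median_of_ratios
     , percentile_cont(0.5) WITHIN GROUP (ORDER BY price)
     / percentile_cont(0.5) WITHIN GROUP (ORDER BY units) AS ratio_of_medians
FROM
  t
GROUP BY id
ORDER BY id;
```

With the sample data, id 1 happens to give 6.25 for both, but id 2 gives 5 for the median of the per-row ratios against roughly 5.45 for the ratio of the medians, so the two approaches can disagree. The cast matters for the average as well: with INTEGER columns, sum(price) / sum(units) is an integer division, so the ceil() around it has nothing left to round up.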