Percentiles in MariaDB
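I am trying to find the 25th and 75th percentiles in MariaDB 10.4.11. According to https://mariadb.com/kb/en/percentile_cont/ I believe the code below is the correct approach, but it returns the same result for every calculation?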
select name,
       percentile_cont(0.25) within group (order by sell_price) over (partition by name) as percentile_25,
       percentile_cont(0.5) within group (order by sell_price) over (partition by name) as median,
       percentile_cont(0.75) within group (order by sell_price) over (partition by name) as percentile_75
from commodity
group by name;
Sample data:
market_id name sell_price
3223191296 beer 175
128081144 beer 175
3225577472 beer 338
3228907520 beer 409
128666762 beer 600
3223210496 beer 646
3543674368 beer 647
3543674368 beer 647
3227117312 beer 690
3224189696 beer 704
3227711744 beer 709
128754255 beer 756
3223191296 coffee 1286
128081144 coffee 1286
3228907520 coffee 1601
3225577472 coffee 1694
128666762 coffee 1703
128754255 coffee 1842
3223210496 coffee 1892
3227117312 coffee 1928
3227711744 coffee 1956
3224189696 coffee 1965
3543674368 coffee 2245
3223891456 coffee 2733
3223891456 beer 4431
Expected results (made up):
name percentile_25 median percentile_75
beer 338 646 704
coffee 1694 1892 2245
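PERCENTILE_CONT is a window function, so it returns a value for every row of its partition instead of collapsing each group to a single row. Every row within a partition carries the same percentile value, so you can get the desired output by aggregating by name in an outer query and taking the maximum of each expression: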
SELECT
    name,
    MAX(percentile_25) AS percentile_25,
    MAX(median) AS median,
    MAX(percentile_75) AS percentile_75
FROM
(
    SELECT
        name,
        PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY sell_price) OVER (PARTITION BY name) AS percentile_25,
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sell_price) OVER (PARTITION BY name) AS median,
        PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY sell_price) OVER (PARTITION BY name) AS percentile_75
    FROM commodity
) t
GROUP BY name;
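To make the window-versus-aggregate distinction concrete, here is a rough Python sketch of what the query above does. It is only a simulation, not MariaDB itself: it uses the first three beer and coffee rows from the sample data, and statistics.median stands in for PERCENTILE_CONT(0.5). The names partitions, windowed and collapsed are made up for illustration.

```python
from collections import defaultdict
from statistics import median

# A few (name, sell_price) rows taken from the sample data in the question.
rows = [
    ("beer", 175), ("beer", 175), ("beer", 338),
    ("coffee", 1286), ("coffee", 1286), ("coffee", 1601),
]

# PARTITION BY name: collect the prices belonging to each name.
partitions = defaultdict(list)
for name, price in rows:
    partitions[name].append(price)

# Window-function behaviour: one output row per input row,
# each carrying the value computed over its whole partition.
windowed = [(name, median(partitions[name])) for name, _ in rows]

# Outer query: GROUP BY name with MAX(...) collapses the repeated rows.
collapsed = {}
for name, value in windowed:
    collapsed[name] = max(collapsed.get(name, value), value)

print(len(windowed))  # same number of rows as the input
print(collapsed)      # one entry per name
```

The inner step produces as many rows as the table has, which is why a second aggregation step is needed to end up with one row per name.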
percentile_cont() is a window function, not an aggregate function. A simple solution is to use select distinct instead of group by:
select distinct name,
       percentile_cont(0.25) within group (order by sell_price) over (partition by name) as percentile_25,
       percentile_cont(0.50) within group (order by sell_price) over (partition by name) as median,
       percentile_cont(0.75) within group (order by sell_price) over (partition by name) as percentile_75
from commodity;
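As a sanity check on what either query should return, the following Python sketch recomputes the quartiles for the sample data. It assumes PERCENTILE_CONT uses the usual linear interpolation between neighbouring ordered values (fractional row position p * (N - 1), counting from zero), as the linked documentation describes. The helper percentile_cont here is a hypothetical re-implementation, not MariaDB code.

```python
import math

def percentile_cont(values, p):
    """Linear-interpolation percentile for a non-empty list, 0 <= p <= 1."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)          # zero-based fractional row position
    lo, hi = math.floor(pos), math.ceil(pos)
    # Interpolate between the two neighbouring rows (lo == hi when pos is whole).
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

# sell_price values from the sample data, grouped by name.
prices = {
    "beer": [175, 175, 338, 409, 600, 646, 647, 647, 690, 704, 709, 756, 4431],
    "coffee": [1286, 1286, 1601, 1694, 1703, 1842, 1892, 1928, 1956, 1965, 2245, 2733],
}

quartiles = {
    name: tuple(percentile_cont(vals, p) for p in (0.25, 0.5, 0.75))
    for name, vals in prices.items()
}

for name, (p25, p50, p75) in quartiles.items():
    print(name, p25, p50, p75)
```

Under that assumption, beer works out to 409, 647 and 704, and coffee to 1670.75, 1867 and 1958.25. These differ from the made-up expected results in the question because interpolated percentiles need not coincide with values that occur in the data.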