How to aggregate percentile_disc() function over date time
I have a table like this:
recorddate score
2021-05-01 0
2021-05-01 1
2021-05-01 2
2021-05-02 3
2021-05-02 4
2021-05-03 5
2021-05-07 6
and I want to get the 60th percentile of score per week. I tried:
select distinct
recorddate
, PERCENTILE_disc(0.60) WITHIN GROUP (ORDER BY score)
OVER (PARTITION BY recorddate) AS top60
from tbl;
It returned something like this:
recorddate top60
2021-05-01 1
2021-05-02 4
2021-05-03 5
2021-05-07 6
But the result I want is aggregated per week (7 days).
For example, for the week ending 2021-05-07:
recorddate top60
2021-05-01 ~ 2021-05-07 2
Is there a way to do this?
I think you want this:
SELECT date_trunc('week', recorddate) AS week
, percentile_disc(0.60) WITHIN GROUP(ORDER BY score) AS top60
FROM tbl
GROUP BY 1;
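This returns the discrete 60th-percentile value for every week that has actual data: the value at or below which 60 % of the rows in the same group (the week) fall. More precisely, in the words of the manual: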
the first value within the ordered set of aggregated argument values whose position in the ordering equals or exceeds the specified fraction.
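To make that definition concrete, here is a minimal sketch that runs the aggregate over the seven sample scores as a single group. It assumes PostgreSQL 9.4 or later (where ordered-set aggregates are available); the alias t and the column name percentile_60 are only illustrative.

```sql
-- Seven sample scores from the question, treated as one group.
-- Positions in the ordering are 1/7, 2/7, ... 7/7.
-- The first position that equals or exceeds 0.60 is 5/7 (about 0.71),
-- so the fifth value in sorted order is returned.
SELECT percentile_disc(0.60) WITHIN GROUP (ORDER BY score) AS percentile_60
FROM  (VALUES (0), (1), (2), (3), (4), (5), (6)) AS t(score);
-- returns 4
```

Because the result is always one of the input values, percentile_disc() never interpolates between rows; percentile_cont() is the interpolating counterpart.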
To add your format on top of it:
SELECT to_char(week_start, 'YYYY-MM-DD" ~ "')
|| to_char(week_start + interval '6 days', 'YYYY-MM-DD') AS week
, top60
FROM (
SELECT date_trunc('week', recorddate) AS week_start
, percentile_disc(0.60) WITHIN GROUP(ORDER BY score) AS top60
FROM tbl
GROUP BY 1
) sub;
I would rather call the column "percentile_60".
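One caveat, assuming PostgreSQL: date_trunc('week', ...) truncates to the Monday of the week, and 2021-05-01 is a Saturday, so the queries above would split the sample rows across two calendar weeks. If the 7-day windows should instead start on a fixed date, a sketch along these lines might work. It assumes recorddate is of type date, that no row is earlier than the anchor date, and it uses 2021-05-01 as a hypothetical anchor taken from the example; the CTE only reproduces the sample data so that the query is self-contained.

```sql
WITH tbl(recorddate, score) AS (
   VALUES (date '2021-05-01', 0)
        , (date '2021-05-01', 1)
        , (date '2021-05-01', 2)
        , (date '2021-05-02', 3)
        , (date '2021-05-02', 4)
        , (date '2021-05-03', 5)
        , (date '2021-05-07', 6)
)
SELECT to_char(week_start, 'YYYY-MM-DD" ~ "')
    || to_char(week_start + 6, 'YYYY-MM-DD') AS week
     , percentile_disc(0.60) WITHIN GROUP (ORDER BY score) AS percentile_60
FROM  (
   -- date - date yields an integer number of days; integer division by 7
   -- gives the 0-based bucket number, and date + integer yields a date again.
   -- Assumes recorddate >= the anchor date (division truncates toward zero).
   SELECT date '2021-05-01'
        + (recorddate - date '2021-05-01') / 7 * 7 AS week_start
        , score
   FROM   tbl
   ) sub
GROUP  BY week_start
ORDER  BY week_start;
```

With the sample data, all seven rows fall into the single bucket 2021-05-01 ~ 2021-05-07. On PostgreSQL 14 or later, date_bin() with an explicit origin is another way to build such buckets.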