Calculate moving sum/count by time condition using window function and filter in PostgreSQL
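I want to calculate the sum of the previous 29 days on the 30th day's row. I am using FILTER with a window function, but the FILTER does not work.
If I use the following, it still sums from beginning to end: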
Select *, Sum(quantity) filter (where time between time - interval '29 day' and time) over ()
from t1
If I use this, it shows an empty column:
Select *, Sum(quantity) filter (where time between time - interval '29 day' and time - interval '1 day') over ()
from t1
Data (I reduced the number of columns for simplicity):
Time sum_quantity
2020-01-01 1
2020-01-02 2
2020-01-03 3
2020-01-04 6
....
2020-01-30 100
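Data types: time is date, quantity is integer.
Desired result:
It should have the same columns as the first table, plus this moving sum column.
30th day = total quantity from day 1 to day 29, for every 30 days.
How can I solve this?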
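Your filter (where) clause is always true, because each row's time is compared with itself, and the empty over() clause makes the window span the whole result set.
You should specify the window in the over clause, not in the filter clause. You probably need something like
sum(quantity) over (order by time rows between 29 preceding and current row)
or, better, range between... .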
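To make the difference concrete, here is a minimal sketch that puts the FILTER attempt and a frame in the over clause side by side. The four sample rows are hypothetical stand-ins for t1, and the column aliases filter_sum and frame_sum are made up for illustration.

```sql
-- Hypothetical sample data standing in for t1 (time date, quantity integer).
with t1(time, quantity) as (
    values (date '2020-01-01', 1),
           (date '2020-01-02', 2),
           (date '2020-01-03', 3),
           (date '2020-01-04', 6)
)
select t1.*,
       -- FILTER is evaluated per input row: "time" on both sides is the same
       -- row's value, so the condition is always true and over () covers all rows.
       sum(quantity) filter (where time between time - interval '29 day' and time)
           over () as filter_sum,
       -- The frame in OVER is what restricts rows relative to the current row.
       sum(quantity) over (order by time
                           rows between 29 preceding and current row) as frame_sum
from t1
order by time;
```

With these rows, filter_sum should be the grand total on every row, while frame_sum grows row by row as a running sum.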
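Please put the condition in a where clause. Since you are using window functions, FILTER behaves just like a conditional expression; the following two forms are equivalent: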
SUM(<expression>) FILTER(WHERE <condition>)
SUM(CASE WHEN <condition> THEN <expression> END)
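As a small illustration of that equivalence, the sketch below runs both forms over the same rows. The sample data and the condition quantity > 2 are hypothetical and chosen only to show that both columns agree.

```sql
-- Hypothetical sample rows; the threshold "quantity > 2" is only for illustration.
with t1(time, quantity) as (
    values (date '2020-01-01', 1),
           (date '2020-01-02', 2),
           (date '2020-01-03', 3),
           (date '2020-01-04', 6)
)
select sum(quantity) filter (where quantity > 2)     as with_filter,
       sum(case when quantity > 2 then quantity end) as with_case
from t1;
```

Both columns sum only the rows that satisfy the condition. Neither form can look at "the current row" of a window, which is why the date range has to go into the frame instead.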
Is this what you want?
select m1.Time
, (select sum(sum_quantity)
from mytable m
where m.time between (m1.time - interval '29 day') and (m1.time)) sum_total
from mytable m1
group by m1.Time
order by m1.Time;
Or this, which is better:
select m1.Time
, sum(m.sum_quantity)
from mytable m
join mytable m1 on m.time between (m1.time - interval '29 day') and (m1.time)
group by m1.Time
order by m1.Time;
You want a window function with a window frame definition that uses range:
select t1.*,
sum(quantity) over (order by time
range between interval '29 day' preceding and current row
)
from t1 ;
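The question asks for the previous 29 days without the current day, and the title also mentions a count. A possible variation, assuming PostgreSQL 11 or later (where range accepts an interval offset), ends the frame one day before the current row. The sample rows, the window name w, and the output column names are hypothetical.

```sql
-- Hypothetical sample rows standing in for t1 (time date, quantity integer);
-- note the gaps between dates.
with t1(time, quantity) as (
    values (date '2020-01-01', 1),
           (date '2020-01-02', 2),
           (date '2020-01-04', 6),
           (date '2020-01-30', 100)
)
select t1.*,
       -- sum over an empty frame is NULL, so coalesce turns it into 0
       coalesce(sum(quantity) over w, 0) as prev_29_day_sum,
       count(*) over w                   as prev_29_day_count
from t1
window w as (order by time
             range between interval '29 day' preceding
                       and interval '1 day' preceding)
order by time;
```

On the 2020-01-30 row the frame covers 2020-01-01 through 2020-01-29, so that row's own quantity is left out. Because range works on the date values rather than on row positions, the gaps in the data do not shift the frame.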
EDIT:
If you have data for every date, you can use rows:
select t1.*,
sum(quantity) over (order by time
rows between 29 preceding and current row
)
from t1 ;
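EDIT II:
If you need to deal with missing dates in older versions of Postgres that do not support range with an offset (added in PostgreSQL 11), then expanding the data is probably the simplest approach:
select t1.*,
       sum(quantity) over (order by d.dte  -- order by the generated date; t1.time is NULL on gap rows
                           rows between 29 preceding and current row
                          )
from (select generate_series(min(t1.time), max(t1.time), interval '1 day') as dte
      from t1
     ) d left join
     t1
     on d.dte = t1.time;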
You may want to filter out the extra rows:
select t1.*
from (select t1.*,
             sum(quantity) over (order by d.dte  -- generated date, never NULL
                                 rows between 29 preceding and current row
                                ) as running_sum
      from (select generate_series(min(t1.time), max(t1.time), interval '1 day') as dte
            from t1
           ) d left join
           t1
           on d.dte = t1.time
     ) t1
where t1.time is not null;