Calculate a moving sum/count by time condition using a window function and FILTER in PostgreSQL

I want to calculate, on the row for day 30, the sum of the previous 29 days. I am using FILTER with a window function, but the FILTER has no effect.

If I use this, it still sums over every row from start to finish:

Select *, Sum(quantity) filter (where time between time - interval '29 day' and time) over ()
from t1 

If I use this, the new column comes back empty:

Select *, Sum(quantity) filter (where time between time - interval '29 day' and time - interval '1 day') over ()
from t1

Data (I reduced the number of columns for simplicity):

Time        sum_quantity
2020-01-01  1
2020-01-02  2
2020-01-03  3
2020-01-04  6
    ....
2020-01-30  100

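Data types: time is a date, quantity is an integer.

Desired result: the same columns as the first table, plus this moving-sum column.

The row for day 30 = total quantity of day 1 through day 29, and likewise for every 30-day window.

How can I solve this?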

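Your filter (where) condition is always true, because it compares each row's time with itself, and with an empty over () clause the window spans the entire result set.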
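To see this, the filter condition can be selected as an ordinary column next to the windowed sum. This is only a sketch: the four sample rows stand in for t1 and are made up for illustration, and the aliases filter_condition and total are hypothetical.

```sql
-- Hypothetical sample data standing in for t1(time date, quantity integer)
with t1(time, quantity) as (
    values (date '2020-01-01', 1),
           (date '2020-01-02', 2),
           (date '2020-01-03', 3),
           (date '2020-01-04', 6)
)
select t1.*,
       -- the same condition as in the FILTER clause, evaluated per row:
       -- a row's own time always lies between (time - 29 days) and time
       (time between time - interval '29 day' and time) as filter_condition,
       -- empty over (): the frame is the whole result set for every row
       sum(quantity) filter (where time between time - interval '29 day' and time)
           over () as total
from t1
order by time;
```

With these rows, filter_condition should be true everywhere, so nothing is filtered out and every row gets the same grand total.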

You should specify the window in the over clause, not in the filter clause. You probably need something like:

sum(quantity) over (order by time rows between 29 preceding and current row)

Or, better, use range between ....
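Since the question asks for the total of the previous 29 days without the current day, the frame can also end one day before the current row. A minimal sketch under stated assumptions: the sample rows and the alias prev_29_days are made up for illustration, and range with an interval offset requires PostgreSQL 11 or later.

```sql
-- Hypothetical sample data standing in for t1(time date, quantity integer)
with t1(time, quantity) as (
    values (date '2020-01-01', 1),
           (date '2020-01-02', 2),
           (date '2020-01-03', 3),
           (date '2020-01-04', 6)
)
select t1.*,
       sum(quantity) over (
           order by time
           -- frame covers the 29 days before the current row's date,
           -- stopping one day short of the current row
           range between interval '29 day' preceding
                     and interval '1 day' preceding
       ) as prev_29_days
from t1
order by time;
```

For the earliest date the frame is empty, so the sum is null there; wrap it in coalesce(..., 0) if a zero is preferred.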

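Please put the condition in a where clause instead. Even though you are using a window function, FILTER just behaves like a conditional expression, so these two forms are equivalent: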

SUM(<expression>) FILTER(WHERE <condition>)
SUM(CASE WHEN <condition> THEN <expression> END)
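As a quick check of that equivalence, both forms can be run side by side as plain aggregates. This is a sketch only: the sample rows, the cutoff date, and the aliases with_filter and with_case are made up for illustration.

```sql
-- Hypothetical sample data standing in for t1(time date, quantity integer)
with t1(time, quantity) as (
    values (date '2020-01-01', 1),
           (date '2020-01-02', 2),
           (date '2020-01-03', 3),
           (date '2020-01-04', 6)
)
select
    -- only rows that satisfy the condition are fed to sum()
    sum(quantity) filter (where time >= date '2020-01-03') as with_filter,
    -- non-matching rows yield null, which sum() ignores
    sum(case when time >= date '2020-01-03' then quantity end) as with_case
from t1;
```

Both columns should return the same value. In either form the condition is evaluated against each input row on its own, which is why it cannot express "relative to the current row" inside a window.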

Is this what you want?

select m1.Time
       , (select sum(sum_quantity) 
          from mytable m
          where m.time between (m1.time - interval '29 day') and (m1.time)) sum_total
from mytable m1
group by m1.Time
order by m1.Time;

Or, better, this:

select m1.Time
       , sum(m.sum_quantity) 
from mytable m
     join mytable m1 on m.time between (m1.time - interval '29 day') and (m1.time)
group by m1.Time
order by m1.Time;

You want a window function with a window frame definition that uses range:

select t1.*,
       sum(quantity) over (order by time
                           range between interval '29 day' preceding and current row
                          ) 
from t1 ;

EDIT:

If you have data for every date, you can use rows:

select t1.*,
       sum(quantity) over (order by time
                           rows between 29 preceding and current row
                          ) 
from t1 ;
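The difference between the two frame types only shows up when dates are missing. The sketch below uses a shortened one-day window and made-up rows with a gap so the effect is easy to see; the aliases by_rows and by_range are hypothetical, and range with an interval offset requires PostgreSQL 11 or later.

```sql
-- Hypothetical sample data with a gap: 2020-01-03 and 2020-01-04 are missing
with t1(time, quantity) as (
    values (date '2020-01-01', 1),
           (date '2020-01-02', 2),
           (date '2020-01-05', 3)
)
select t1.*,
       -- counts physical rows: always the previous row plus the current one
       sum(quantity) over (order by time
                           rows between 1 preceding and current row
                          ) as by_rows,
       -- compares dates: only rows at most one day before the current date
       sum(quantity) over (order by time
                           range between interval '1 day' preceding and current row
                          ) as by_range
from t1
order by time;
```

On the 2020-01-05 row, by_rows still picks up the 2020-01-02 quantity because it is the previous row, while by_range does not because that date is more than one day back.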

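EDIT II:

If you need to handle missing dates in older versions of Postgres that do not support range with an offset (it was added in PostgreSQL 11), then expanding the data is probably the simplest method: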

select t1.*,
       -- order by the generated date, not t1.time, which is null for missing days
       sum(quantity) over (order by d.dte
                           rows between 29 preceding and current row
                           ) 
from (select generate_series(min(t1.time), max(t1.time), interval '1 day') as dte
      from t1
     ) d left join
     t1
     on d.dte = t1.time;

You might want to filter out the extra rows:

select t1.*
from (select t1.*,
             -- order by the generated date so the filler rows keep their place
             sum(quantity) over (order by d.dte
                                 rows between 29 preceding and current row
                                 ) as running_sum
      from (select generate_series(min(t1.time), max(t1.time), interval '1 day') as dte
            from t1
           ) d left join
           t1
           on d.dte = t1.time
     ) t1
where t1.time is not null;