Is it possible to calculate a cumulative sum or moving average with a TimescaleDB continuous aggregate?

Question

Consider a table with 2 columns:

create table foo
(
    ts             timestamp,
    precipitation  numeric,
    primary key (ts)
);

with the following data:

ts	precipitation
2021-06-01 12:00:00	1
2021-06-01 13:00:00	0
2021-06-01 14:00:00	2
2021-06-01 15:00:00	3

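I would like to use a TimescaleDB continuous aggregate to calculate a three-hour cumulative sum of this data, computed once per hour. Using the example data above, my continuous aggregate would contain: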

ts	cum_precipitation
2021-06-01 12:00:00	1
2021-06-01 13:00:00	1
2021-06-01 14:00:00	3
2021-06-01 15:00:00	5

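I can't see a way to do this with the supported continuous aggregate syntax. Am I missing something? Essentially, I want the time bucket to cover the preceding x hours, but the calculation to happen once per hour.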

Answer 1

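Good question!

You can do this by computing a normal continuous aggregate and then running a window function over it. That is, calculate a sum() for each hour in the continuous aggregate, then apply sum() again as a window function over those hourly rows.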
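Applied to the table from the question, that approach might look like the sketch below. It assumes foo has already been converted to a hypertable, and the view name foo_hourly and the alias last3 are made up for illustration. The frame reaches back 2 hours because a RANGE frame also includes the current row, so it spans three hourly buckets in total.

```sql
-- Assumption: foo is already a hypertable, e.g. created while still empty with
--   SELECT create_hypertable('foo', 'ts');
-- foo_hourly is a hypothetical view name.
CREATE MATERIALIZED VIEW foo_hourly
WITH (timescaledb.continuous)
AS SELECT
    time_bucket('1 hour'::interval, ts) AS bucket,
    sum(precipitation) AS precipitation
FROM foo
GROUP BY 1;

-- Rolling three-hour total: the current bucket plus the buckets
-- that start up to 2 hours before it.
SELECT bucket AS ts,
    sum(precipitation) OVER last3 AS cum_precipitation
FROM foo_hourly
WINDOW last3 AS
(ORDER BY bucket RANGE BETWEEN '2 hours'::interval PRECEDING AND CURRENT ROW);
```

A RANGE frame is used rather than ROWS so that a missing hour shrinks the window instead of silently pulling in an older bucket.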

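When you get into more complex aggregates, such as averages, standard deviations, or percentile approximations, I'd recommend switching to some of the new two-step aggregates we introduced in the TimescaleDB Toolkit. Specifically, I'd look into the statistical aggregates we recently introduced. They can also do this cumulative-sum type of thing. (They only work with DOUBLE PRECISION or things that can be cast to it, i.e. FLOAT. I'd highly recommend that you don't use NUMERIC and instead switch to doubles or floats; it doesn't seem like you really need infinite-precision calculations here.)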

You can take a look at some of the queries I wrote in this presentation, but it might look something like:

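-- Hourly continuous aggregate that stores a partial statistical summary per api_id.
CREATE MATERIALIZED VIEW response_times_hourly
WITH (timescaledb.continuous)
AS SELECT api_id,
    time_bucket('1 hour'::interval, ts) as bucket,
    stats_agg(response_time)
FROM response_times
GROUP BY 1, 2;

-- rolling() combines the stored summaries over the window frame;
-- average() and sum() then read the result out of the combined summary.
SELECT bucket,
    average(rolling(stats_agg) OVER last3),
    sum(rolling(stats_agg) OVER last3)
FROM response_times_hourly
WHERE api_id = 32
WINDOW last3 as
-- The frame includes the current bucket and every bucket up to 3 hours before it.
(ORDER BY bucket RANGE '3 hours' PRECEDING);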
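For the precipitation table in the question, the same pattern could be sketched as follows. This assumes the timescaledb_toolkit extension is installed and foo is a hypertable; the view name foo_stats_hourly and the column alias stats are hypothetical. The cast is there only because the column was declared NUMERIC.

```sql
-- Assumptions: timescaledb_toolkit is installed and foo is a hypertable.
-- foo_stats_hourly and the alias "stats" are hypothetical names.
CREATE MATERIALIZED VIEW foo_stats_hourly
WITH (timescaledb.continuous)
AS SELECT
    time_bucket('1 hour'::interval, ts) AS bucket,
    -- stats_agg works on DOUBLE PRECISION, so cast the NUMERIC column.
    stats_agg(precipitation::double precision) AS stats
FROM foo
GROUP BY 1;

-- Three-hour rolling sum and moving average from the same stored summaries.
SELECT bucket AS ts,
    sum(rolling(stats) OVER last3) AS cum_precipitation,
    average(rolling(stats) OVER last3) AS avg_precipitation
FROM foo_stats_hourly
WINDOW last3 AS
(ORDER BY bucket RANGE BETWEEN '2 hours'::interval PRECEDING AND CURRENT ROW);
```

Because the summaries are combined before the accessor runs, the average here is taken over the underlying raw rows in the window, not over the per-bucket averages.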
