How to generate a series between the min and max timestamps of two tables (PostgreSQL)

Question

I have two tables with frequently updated weather data. Table A has data at 10-minute intervals, and table B at 1-hour intervals.

Table A (actual weather)

observationtime	temperature
17/02/21 00:00	9
17/02/21 00:10	9
17/02/21 00:20	9
17/02/21 00:30	9
...	...
17/02/21 03:00	9

Table B (weather forecast)

observationtime	temperature
17/02/21 04:00	9
17/02/21 05:00	9
17/02/21 06:00	9
17/02/21 07:00	9

What I want

observationtime	realized_temperature	forecasted_temperature
17/02/21 00:00	9
17/02/21 01:00	9
17/02/21 02:00	9
17/02/21 03:00	9
17/02/21 04:00		9
17/02/21 05:00		9
17/02/21 06:00		9
17/02/21 07:00		9

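As far as I can tell, four things need to happen:

Get the minimum timestamp from table A and round it down to the whole hour
Get the maximum timestamp from the forecast table
Generate a series between these two timestamps at 1-hour intervals
Join tables A and B on the generated series

I'm not sure how to do this. Does anyone have a solution?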

Answer 1

I don't think you need generate_series():


SELECT
    COALESCE(r.observationtime, fc.observationtime) as observationtime,     -- 3
    r.temperature as realized_temperature,
    fc.temperature as forecasted_temperature
FROM (
    SELECT DISTINCT ON (date_trunc('hour', observationtime))                -- 1
        *
    FROM r
    ORDER BY date_trunc('hour', observationtime), observationtime
) r
FULL JOIN fc ON r.observationtime = fc.observationtime                      -- 2
ORDER BY 1

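1. First, extract the first record of each hour from the realized table. This can be done with DISTINCT ON, which returns the first record of each ordered group. Here the group is the hour: date_trunc() truncates the hh:10 to hh:50 values to the full hour so they fall into the same group.
2. Use a FULL JOIN: this joins the tables even where no timestamps match.
3. Use COALESCE() to return the first non-NULL value in a list. So if there is a realized timestamp, it is used; otherwise the forecast timestamp is used.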
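
For comparison, the generate_series() route outlined in the question could look roughly like the sketch below. It is only an illustration under some assumptions: the table names r and fc follow the query above, the sample rows in the VALUES lists are made up (with the dates read as 17 February 2021), and the bounds CTE and the alias s are names chosen here. It also assumes the hourly rows in r fall exactly on the hour, since the 10-minute rows in between are simply not matched.

```sql
-- Made-up sample data standing in for the two tables (assumption).
WITH r (observationtime, temperature) AS (
    VALUES
        (timestamp '2021-02-17 00:00', 9),
        (timestamp '2021-02-17 00:10', 9),
        (timestamp '2021-02-17 01:00', 9),
        (timestamp '2021-02-17 03:00', 9)
),
fc (observationtime, temperature) AS (
    VALUES
        (timestamp '2021-02-17 04:00', 9),
        (timestamp '2021-02-17 05:00', 9)
),
-- Steps 1 and 2: earliest realized timestamp truncated to the hour,
-- and latest forecast timestamp.
bounds AS (
    SELECT
        date_trunc('hour', (SELECT min(observationtime) FROM r)) AS first_hour,
        (SELECT max(observationtime) FROM fc) AS last_hour
)
SELECT
    s.observationtime,
    r.temperature AS realized_temperature,
    fc.temperature AS forecasted_temperature
FROM bounds b
-- Step 3: one row per hour between the two bounds.
CROSS JOIN LATERAL generate_series(b.first_hour, b.last_hour, interval '1 hour')
    AS s(observationtime)
-- Step 4: attach both tables to the generated hours.
LEFT JOIN r ON r.observationtime = s.observationtime
LEFT JOIN fc ON fc.observationtime = s.observationtime
ORDER BY s.observationtime;
```

One difference from the FULL JOIN version: hours that have no row in either table still show up here. With the sample rows above, 02:00 is returned with NULL in both temperature columns, which may or may not be what you want.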
