Start time_bucket_gapfill buckets at a given hour and minute
I have the following forecast table:
| id | timestamp | name | temp |
------------------------------------------------------------------
| 1 | 2022-01-16 12:40:06 | Bancal 1 | 22 |
| 2 | 2022-01-16 12:58:05 | Bancal 1 | 21 |
| 3 | 2022-01-16 13:22:00 | Bancal 1 | 30 |
| 4 | 2022-01-16 13:30:20 | Bancal 1 | 10 |
| 5 | 2022-01-16 13:59:06 | Bancal 1 | 15 |
| 6 | 2022-01-16 15:40:00 | Bancal 2 | 15 |
| 7 | 2022-01-16 15:54:06 | Bancal 1 | 18 |
| 8 | 2022-01-17 10:30:05 | Bancal 2 | 23 |
| 9 | 2022-01-17 11:20:00 | Bancal 1 | 12 |
| 10 | 2022-01-17 11:32:07 | Bancal 3 | 28 |
| 11 | 2022-01-17 13:30:06 | Bancal 1 | 23 |
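I want to query in 1-hour intervals and fill the gaps, but I want the intervals to start at a specified hour and minute. For example, if the start datetime is '2022-01-16 12:38:52',
then the 1-hour intervals should be: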
2022-01-16 12:38:52
2022-01-16 13:38:52
2022-01-16 14:38:52
2022-01-16 15:38:52
.
.
.
2022-01-17 09:38:52
2022-01-17 10:38:52
2022-01-17 11:38:52
2022-01-17 12:38:52
2022-01-17 13:38:52
I am using TimescaleDB's time_bucket_gapfill function, but the buckets are aligned to the start of the hour:
SELECT time_bucket_gapfill(interval '1 hour', timestamp) AS init,
name,
avg(temp) AS avg_temp
FROM forecast
WHERE timestamp >= '2022-01-16 12:38:52' AND timestamp<= '2022-01-17 13:38:52'
GROUP BY name, init
ORDER BY init;
| init | name | avg_temp
2022-01-16 12:00:00.000000 | Bancal 2 |
2022-01-16 12:00:00.000000 | Bancal 1 | 21.5
2022-01-16 12:00:00.000000 | Bancal 3 |
2022-01-16 13:00:00.000000 | Bancal 1 | 18.3333333333333333
2022-01-16 13:00:00.000000 | Bancal 3 |
2022-01-16 13:00:00.000000 | Bancal 2 |
2022-01-16 14:00:00.000000 | Bancal 3 |
2022-01-16 14:00:00.000000 | Bancal 1 |
2022-01-16 14:00:00.000000 | Bancal 2 |
2022-01-16 15:00:00.000000 | Bancal 2 | 15
2022-01-16 15:00:00.000000 | Bancal 1 | 18
2022-01-16 15:00:00.000000 | Bancal 3 |
...
2022-01-17 09:00:00.000000 | Bancal 1 |
2022-01-17 10:00:00.000000 | Bancal 2 | 23
2022-01-17 10:00:00.000000 | Bancal 3 |
2022-01-17 10:00:00.000000 | Bancal 1 |
2022-01-17 11:00:00.000000 | Bancal 2 |
2022-01-17 11:00:00.000000 | Bancal 1 | 12
2022-01-17 11:00:00.000000 | Bancal 3 | 28
2022-01-17 12:00:00.000000 | Bancal 2 |
2022-01-17 12:00:00.000000 | Bancal 1 |
2022-01-17 12:00:00.000000 | Bancal 3 |
2022-01-17 13:00:00.000000 | Bancal 3 |
2022-01-17 13:00:00.000000 | Bancal 1 | 23
2022-01-17 13:00:00.000000 | Bancal 2 |
The avg results are not what I want, because the data is taken from '2022-01-16 12:00:00' to '2022-01-16 13:00:00'
instead of from '2022-01-16 12:38:52' to '2022-01-16 13:38:52'.
Is there a way to make time_bucket_gapfill use these intervals?
Expected:
| init | name | avg_temp
2022-01-16 12:38:00.000000 | Bancal 2 |
2022-01-16 12:38:00.000000 | Bancal 1 | 20.75
2022-01-16 12:38:00.000000 | Bancal 3 |
2022-01-16 13:38:00.000000 | Bancal 1 | 15
2022-01-16 13:38:00.000000 | Bancal 3 |
2022-01-16 13:38:00.000000 | Bancal 2 |
2022-01-16 14:38:00.000000 | Bancal 3 |
2022-01-16 14:38:00.000000 | Bancal 1 |
2022-01-16 14:38:00.000000 | Bancal 2 |
2022-01-16 15:38:00.000000 | Bancal 2 | 15
2022-01-16 15:38:00.000000 | Bancal 1 | 18
2022-01-16 15:38:00.000000 | Bancal 3 |
...
2022-01-17 09:38:00.000000 | Bancal 2 | 23
2022-01-17 10:38:00.000000 | Bancal 2 |
2022-01-17 10:38:00.000000 | Bancal 3 | 28
2022-01-17 10:38:00.000000 | Bancal 1 | 12
2022-01-17 11:38:00.000000 | Bancal 2 |
2022-01-17 11:38:00.000000 | Bancal 1 |
2022-01-17 11:38:00.000000 | Bancal 3 |
2022-01-17 12:38:00.000000 | Bancal 2 |
2022-01-17 12:38:00.000000 | Bancal 1 | 23
2022-01-17 12:38:00.000000 | Bancal 3 |
2022-01-17 13:38:00.000000 | Bancal 3 |
2022-01-17 13:38:00.000000 | Bancal 1 |
2022-01-17 13:38:00.000000 | Bancal 2 |
I would use the generate_series function.
generate_series(start, stop, step interval)
The third parameter is the step, so you can put the interval you want there. In your case that would be 1 hours:
SELECT *
FROM generate_series('2022-01-16 12:38:52'::timestamp,'2022-01-17 13:38:52'::timestamp,'1 hours') v
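The series above only yields the bucket boundaries. As a quick check of which boundary each row falls under, a minimal sketch using date_bin might help. It assumes PostgreSQL 14 or later (where date_bin(stride, source, origin) is available) and the forecast table from the question; the bucket_start alias is just an illustrative name.

```sql
-- Assumption: PostgreSQL 14+, which provides date_bin(stride, source, origin).
-- Maps every reading to the start of its 1-hour bucket, counted from the
-- custom origin 12:38:52 instead of from the top of the hour.
SELECT id,
       "timestamp",
       name,
       temp,
       date_bin(interval '1 hour',
                "timestamp",
                timestamp '2022-01-16 12:38:52') AS bucket_start
FROM forecast
ORDER BY "timestamp";
```

With the sample data, ids 1 to 4 land in the bucket starting at 2022-01-16 12:38:52, which is consistent with the expected average of 20.75 for Bancal 1, and id 5 lands in the bucket starting at 13:38:52.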
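Edit
You can try using a CTE or a subquery to build a calendar of time intervals for each name, and then use the LEAD window function to get the end of each interval for the join condition.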
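WITH CTE AS (
    -- Calendar: one row per name and per 1-hour boundary starting at 12:38:52
    SELECT DISTINCT
           name,
           generate_series('2022-01-16 12:38:52'::timestamp, '2022-01-17 13:38:52'::timestamp, '1 hours') dt
    FROM forecast
)
SELECT t1.dt AS init,
       t1.name,
       avg(t2.temp) AS avg_temp  -- stays NULL for buckets without readings
FROM (
    -- n_dt is the start of the next bucket, i.e. the exclusive end of this one
    SELECT *, LEAD(dt) OVER (PARTITION BY name ORDER BY dt) n_dt
    FROM CTE
) t1
LEFT JOIN forecast t2
       ON t1.name = t2.name
      AND t2.timestamp >= t1.dt
      AND t2.timestamp < t1.n_dt  -- half-open range, so a row on a boundary is counted once
GROUP BY t1.name,
         t1.dt
ORDER BY t1.dt;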
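If you would rather stay with time_bucket_gapfill, one possible workaround is to shift the timestamps back by the offset (38 minutes 52 seconds here), let gapfill create hour-aligned buckets, and shift the result forward again. This is an untested sketch, not a confirmed TimescaleDB recipe: it assumes that time_bucket_gapfill accepts an expression as its time argument when start and finish are passed explicitly, and the bucket and shifted aliases are illustrative names.

```sql
-- Untested sketch. Assumption: time_bucket_gapfill accepts an expression as
-- the time argument as long as start and finish are supplied explicitly.
SELECT bucket + interval '38 minutes 52 seconds' AS init,  -- shift back to the real boundary
       name,
       avg_temp
FROM (
    SELECT time_bucket_gapfill(
               interval '1 hour',
               "timestamp" - interval '38 minutes 52 seconds',  -- shifted time
               timestamp '2022-01-16 12:00:00',  -- gapfill range start, shifted
               timestamp '2022-01-17 13:00:00'   -- gapfill range end, shifted
           ) AS bucket,
           name,
           avg(temp) AS avg_temp
    FROM forecast
    WHERE "timestamp" >= '2022-01-16 12:38:52'
      AND "timestamp" <  '2022-01-17 13:38:52'
    GROUP BY bucket, name
) shifted
ORDER BY init, name;
```

The subquery keeps time_bucket_gapfill itself as the grouping expression, and the offset is only added back in the outer query. Whether the final boundary is included depends on how your TimescaleDB version treats finish, so check the first and last rows against the expected output.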