PostgreSQL: compare consecutive rows and insert an identical row where values are missing
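I have a table that gives me data every 15 minutes, and I need that time granularity.
I noticed that sometimes I have no data for 3/4 hours, and in that case I need to duplicate the last available row with the missing timestamp.
Example: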
| product_id | total_revenue | timestamp |
|---|---|---|
| 1 | 50 | 01-01-2021 00:00:00 |
| 2 | 17 | 01-01-2021 00:00:00 |
| 3 | 30 | 01-01-2021 00:00:00 |
| 1 | 67 | 01-01-2021 00:15:00 |
| 2 | 31 | 01-01-2021 00:15:00 |
| 1 | 67 | 01-01-2021 00:30:00 |
| 2 | 31 | 01-01-2021 00:30:00 |
| 3 | 33 | 01-01-2021 00:30:00 |
But I need output like this:
| product_id | total_revenue | timestamp |
|---|---|---|
| 1 | 50 | 01-01-2021 00:00:00 |
| 2 | 17 | 01-01-2021 00:00:00 |
| 3 | 30 | 01-01-2021 00:00:00 |
| 1 | 67 | 01-01-2021 00:15:00 |
| 2 | 31 | 01-01-2021 00:15:00 |
| 3 | 30 | 01-01-2021 00:15:00 |
| 1 | 67 | 01-01-2021 00:30:00 |
| 2 | 31 | 01-01-2021 00:30:00 |
| 3 | 33 | 01-01-2021 00:30:00 |
I have a select statement like:
select product_id, total_revenue, timestamp
from revenue
(I also calculate the difference between two consecutive rows.)
Does anyone know how to help me?
One approach uses generate_series() and lead():
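with tt as (
    -- for each row, look up the same product's next timestamp
    select product_id, total_revenue, timestamp,
           lead(timestamp) over (partition by product_id order by timestamp) as next_timestamp
    from t
)
-- emit one row per 15-minute slot from this row up to, but not including, the next one
select tt.product_id,
       coalesce(gs.ts, tt.timestamp) as timestamp,
       tt.total_revenue
from tt
left join lateral generate_series(
         tt.timestamp,
         tt.next_timestamp - interval '15 minute',
         interval '15 minute'
     ) gs(ts) on true;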
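To try the query above against the question's sample data, a setup along these lines might help. This is only a sketch: the column types are assumptions, since the question does not state them, and the table is named `t` to match the query rather than `revenue` as in the question.

```sql
-- Assumed schema: the question does not give the column types.
create table t (
    product_id    int,
    total_revenue int,
    "timestamp"   timestamp
);

-- Sample rows from the question; product 3 has no row at 00:15.
insert into t (product_id, total_revenue, "timestamp") values
    (1, 50, '2021-01-01 00:00:00'),
    (2, 17, '2021-01-01 00:00:00'),
    (3, 30, '2021-01-01 00:00:00'),
    (1, 67, '2021-01-01 00:15:00'),
    (2, 31, '2021-01-01 00:15:00'),
    (1, 67, '2021-01-01 00:30:00'),
    (2, 31, '2021-01-01 00:30:00'),
    (3, 33, '2021-01-01 00:30:00');
```

For product 3, the 00:00 row has a next_timestamp of 00:30, so generate_series() covers 00:00 and 00:15, and the row (3, 30) is repeated for the missing 00:15 slot. The last row of each product has no next timestamp, so the series is empty and coalesce() falls back to the row's own timestamp.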
Note: my guess is that you also want to extend this to the latest timestamp in the table:
with tt as (
    select product_id, total_revenue, timestamp,
           lead(timestamp) over (partition by product_id order by timestamp) as next_timestamp,
           -- latest timestamp across all products
           max(timestamp) over () as max_timestamp
    from t
)
select tt.product_id,
       coalesce(gs.ts, tt.timestamp) as timestamp,
       tt.total_revenue
from tt
left join lateral generate_series(
         tt.timestamp,
         -- a product's last row has no next_timestamp, so fill up to max_timestamp
         coalesce(tt.next_timestamp - interval '15 minute', tt.max_timestamp),
         interval '15 minute'
     ) gs(ts) on true;
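Also, if the timestamps are not exactly at 15-minute intervals, then I would suggest that you ask a new question, with an explanation and more realistic sample data.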
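Since the question also mentions computing the difference between two consecutive rows, one possible way to combine that with the gap filling is to wrap the filled rows in a second CTE and apply lag() to it. This is a sketch rather than part of the original answer; the names `filled`, `ts`, and `revenue_diff` are illustrative, and the table is again assumed to be called `t`.

```sql
with tt as (
    select product_id, total_revenue, "timestamp",
           lead("timestamp") over (partition by product_id order by "timestamp") as next_timestamp,
           max("timestamp") over () as max_timestamp
    from t
),
filled as (
    -- gap-filled rows: one row per product and 15-minute slot
    select tt.product_id,
           coalesce(gs.ts, tt."timestamp") as ts,
           tt.total_revenue
    from tt
    left join lateral generate_series(
             tt."timestamp",
             coalesce(tt.next_timestamp - interval '15 minute', tt.max_timestamp),
             interval '15 minute'
         ) gs(ts) on true
)
select product_id,
       ts as "timestamp",
       total_revenue,
       -- difference to the same product's previous slot; null for the first slot
       total_revenue - lag(total_revenue) over (partition by product_id order by ts) as revenue_diff
from filled
order by ts, product_id;
```

Because a filled-in row repeats the previous value, its revenue_diff is 0; whether that is the desired behaviour depends on how the differences are used downstream.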