Generate date series including next row value
I have a table:
╔════════════╦════════╦════════════╗
║ product_id ║ amount ║ date       ║
╠════════════╬════════╬════════════╣
║ 1          ║ 100    ║ 2019-01-01 ║
║ 2          ║ 150    ║ 2019-01-01 ║
║ 1          ║ 200    ║ 2019-01-05 ║
║ 2          ║ 180    ║ 2019-01-03 ║
║ 2          ║ 150    ║ 2019-01-05 ║
╚════════════╩════════╩════════════╝
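I need to generate the missing daily rows for each product, repeating each amount until the date of that product's next row. I need a result like this: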
╔════════════╦════════╦════════════╗
║ product_id ║ amount ║ date       ║
╠════════════╬════════╬════════════╣
║ 1          ║ 100    ║ 2019-01-01 ║
║ 1          ║ 100    ║ 2019-01-02 ║
║ 1          ║ 100    ║ 2019-01-03 ║
║ 1          ║ 100    ║ 2019-01-04 ║
║ 1          ║ 200    ║ 2019-01-05 ║
║ 2          ║ 150    ║ 2019-01-01 ║
║ 2          ║ 150    ║ 2019-01-02 ║
║ 2          ║ 180    ║ 2019-01-03 ║
║ 2          ║ 180    ║ 2019-01-04 ║
║ 2          ║ 150    ║ 2019-01-05 ║
╚════════════╩════════╩════════════╝
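You can use generate_series() in an aggregate subquery to generate the "missing" dates.
Then the previous non-null amount has to be carried into the new rows. This would be straightforward if Postgres supported the ignore nulls option for lag(), but it does not. One way to work around this is to use a window count to define groups, and then first_value():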
select
    product_id,
    dt,
    -- the first row of each group is the one that has a non-null amount
    first_value(amount) over(partition by product_id, grp order by dt) amount
from (
    select
        x.*,
        t.amount,
        -- running count of non-null amounts: increases at each original row
        count(*) filter(where t.amount is not null)
            over(partition by x.product_id order by x.dt) grp
    from (
        -- one row per day between each product's first and last date
        select product_id, generate_series(min(date), max(date), '1 day'::interval) dt
        from mytable
        group by product_id
    ) x
    left join mytable t on t.product_id = x.product_id and t.date = x.dt
) t
order by product_id, dt
product_id | dt                     | amount
---------: | :--------------------- | -----:
         1 | 2019-01-01 00:00:00+00 |    100
         1 | 2019-01-02 00:00:00+00 |    100
         1 | 2019-01-03 00:00:00+00 |    100
         1 | 2019-01-04 00:00:00+00 |    100
         1 | 2019-01-05 00:00:00+00 |    200
         2 | 2019-01-01 00:00:00+00 |    150
         2 | 2019-01-02 00:00:00+00 |    150
         2 | 2019-01-03 00:00:00+00 |    180
         2 | 2019-01-04 00:00:00+00 |    180
         2 | 2019-01-05 00:00:00+00 |    150
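
To see how the window count defines the groups, it can help to run the inner query on its own. The following is a minimal sketch, assuming a table named mytable with the columns from the question. The create/insert statements are a hypothetical setup for trying it out, and the dt::date cast is added only to make the output easier to read.

```sql
-- Hypothetical setup: table and column names follow the question.
create temp table mytable (product_id int, amount int, date date);

insert into mytable (product_id, amount, date) values
    (1, 100, '2019-01-01'),
    (2, 150, '2019-01-01'),
    (1, 200, '2019-01-05'),
    (2, 180, '2019-01-03'),
    (2, 150, '2019-01-05');

-- Intermediate step only: the generated dates, the joined amount
-- (null on generated days), and the running count used as the group id.
select
    x.product_id,
    x.dt::date as dt,
    t.amount,
    count(*) filter(where t.amount is not null)
        over(partition by x.product_id order by x.dt) as grp
from (
    select product_id, generate_series(min(date), max(date), '1 day'::interval) as dt
    from mytable
    group by product_id
) x
left join mytable t on t.product_id = x.product_id and t.date = x.dt
order by x.product_id, x.dt;

-- For the sample data, grp comes out as:
--   product 1: 1, 1, 1, 1, 2
--   product 2: 1, 1, 2, 2, 3
```

Each original row increments the count, so a generated day shares its grp with the closest preceding original row. Within a (product_id, grp) partition ordered by dt, the first row is therefore the only one with a non-null amount, which is what first_value() picks up in the full query.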