How to generate series using start and end date and quarters on postgres

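I have a table like the one below. Using the start and end dates (the last two columns), I want to split each row's value evenly across the 3 months of its quarter, for every matching quarter between the start and end dates.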

I'm familiar with generate_series and intervals in Postgres, but I'm struggling to get the result I want.

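My table has an ID column that groups rows together, a quarter column indicating which quarter the row refers to for that ID, a value column holding the value for the whole quarter (and for that quarter in every year of the date range), and start_date and end_date columns giving the date range. Here is a sample: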

ID  quarter     value   start_date  end_date 
1   2            152    2019-11-07  2050-12-30
1   1            785    2019-11-07  2050-12-30
2   2            152    2019-03-05  2050-12-30
2   1            785    2019-03-05  2050-12-30
3   4            41     2018-06-12  2050-12-30
3   3            50     2018-06-12  2050-12-30
3   2            88     2018-06-12  2050-12-30
3   1            29     2018-06-12  2050-12-30
4   2           1607    2018-12-17  2050-12-30
4   1           4803    2018-12-17  2050-12-30

This is my desired output (for ID 1):

ID  quarter     value   start_date  end_date 
1   2            152/3  2020-04-01  2020-07-01
1   1            785/3  2020-01-01  2020-04-01
1   2            152/3  2021-04-01  2021-07-01
1   1            785/3  2021-01-01  2021-04-01

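The start_date in the output should be the first quarter after the table's start_date. I need to generate the series from the first table's start_date to its end_date.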

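You can do this with the GENERATE_SERIES function: for each unique row (by ID), pass in the start and end dates and set the interval to 3 months. Then join the result back to the original table on ID and quarter.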
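As a minimal sketch of just the series step, the query below lists the quarter start dates for a single date range. The start date is the one for ID 1 in the question; the end date 2021-06-30 is an assumption chosen only to keep the output short, and the names gs, quarter_start and quarter are illustrative.

```sql
-- Quarter start dates covering one sample date range.
-- The end date is shortened for illustration (assumption, not from the question).
SELECT
    gs::DATE AS quarter_start,
    EXTRACT(QUARTER FROM gs)::INT AS quarter
FROM GENERATE_SERIES(
    DATE_TRUNC('quarter', DATE '2019-11-07'),  -- first day of the quarter containing the start date
    DATE_TRUNC('quarter', DATE '2021-06-30'),  -- first day of the quarter containing the end date
    INTERVAL '3 months'
) AS gs;
```

The series begins at 2019-10-01, the quarter that contains the start date, and advances one quarter per row.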

Here is a full example (note that original_data is the name I gave to your first table):

WITH
quarters_table AS (
    SELECT
        t.ID,
        (EXTRACT('month' FROM t.quarter_date) - 1)::INT / 3 + 1 AS quarter,
        t.quarter_date::DATE AS start_date,
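        COALESCE(
            -- Next quarter start for the same ID. The window is partitioned and
            -- ordered so that LEAD cannot pick up a row from another ID.
            LEAD(t.quarter_date) OVER (PARTITION BY t.ID ORDER BY t.quarter_date),
            DATE_TRUNC('quarter', t.original_end_date) + INTERVAL '3 months'
        )::DATE AS end_date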
    FROM (
        SELECT
            original_record.ID,
            original_record.end_date AS original_end_date,
            GENERATE_SERIES(
                DATE_TRUNC('quarter', original_record.start_date),
                DATE_TRUNC('quarter', original_record.end_date),
                INTERVAL '3 months'
            ) AS quarter_date
        FROM (
            SELECT DISTINCT ON (original_data.ID)
                original_data.ID,
                original_data.start_date,
                original_data.end_date
            FROM
                original_data
            ORDER BY
                original_data.ID
        ) AS original_record
    ) AS t
)

SELECT
    quarters_table.ID,
    quarters_table.quarter,
    original_data.value::DOUBLE PRECISION / 3 AS value,
    quarters_table.start_date,
    quarters_table.end_date
FROM
    quarters_table
INNER JOIN
    original_data
ON
    quarters_table.ID = original_data.ID
    AND quarters_table.quarter = original_data.quarter;

Sample output (excerpt for ID 1):

 id | quarter |      value       | start_date |  end_date  
----+---------+------------------+------------+------------
  1 |       1 | 261.666666666667 | 2020-01-01 | 2020-04-01
  1 |       2 | 50.6666666666667 | 2020-04-01 | 2020-07-01
  1 |       1 | 261.666666666667 | 2021-01-01 | 2021-04-01
  1 |       2 | 50.6666666666667 | 2021-04-01 | 2021-07-01

For completeness, here is the original_data table I used in my tests:

WITH
original_data AS (
    SELECT
        1 AS ID,
        2 AS quarter,
        152 AS value,
        '2019-11-07'::DATE AS start_date,
        '2050-12-30'::DATE AS end_date
    
    UNION ALL

    SELECT
        1 AS ID,
        1 AS quarter,
        785 AS value,
        '2019-11-07'::DATE AS start_date,
        '2050-12-30'::DATE AS end_date
    
    UNION ALL

    SELECT
        2 AS ID,
        2 AS quarter,
        152 AS value,
        '2019-03-05'::DATE AS start_date,
        '2050-12-30'::DATE AS end_date
    
    -- ...
)
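If the series should begin with the first quarter after start_date, as the question describes, one possible variant is sketched below. It is not the query above: it swaps the window function for a LATERAL call to GENERATE_SERIES per row and derives end_date by adding 3 months to each quarter start. The aliases d, q and quarter_start are illustrative, and the inline rows are the two ID 1 rows from the question.

```sql
WITH original_data (ID, quarter, value, start_date, end_date) AS (
    -- Sample rows for ID 1, copied from the question.
    VALUES
        (1, 2, 152, DATE '2019-11-07', DATE '2050-12-30'),
        (1, 1, 785, DATE '2019-11-07', DATE '2050-12-30')
)
SELECT
    d.ID,
    d.quarter,
    d.value::DOUBLE PRECISION / 3 AS value,
    q.quarter_start::DATE AS start_date,
    (q.quarter_start + INTERVAL '3 months')::DATE AS end_date
FROM original_data AS d
-- One series per row, starting at the quarter AFTER the one containing start_date.
CROSS JOIN LATERAL GENERATE_SERIES(
    DATE_TRUNC('quarter', d.start_date) + INTERVAL '3 months',
    DATE_TRUNC('quarter', d.end_date),
    INTERVAL '3 months'
) AS q(quarter_start)
-- Keep only the quarters that this row describes.
WHERE EXTRACT(QUARTER FROM q.quarter_start)::INT = d.quarter
ORDER BY d.ID, q.quarter_start;
```

Note that this sketch always skips the quarter containing start_date, even when start_date falls exactly on a quarter boundary; whether that is wanted depends on the requirement.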

Here is one way to do it. The example follows the output you outlined; you can then add more conditions to the CASE/WHEN for the other quarters.

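SELECT
    ID,
    Quarter,
    Value / 3.0 AS "Value",  -- 3.0 avoids integer division
    CASE
        WHEN Quarter = 1 THEN DATE '2020-01-01'
        WHEN Quarter = 2 THEN DATE '2020-04-01'
    END AS "Start_Date",
    CASE
        WHEN Quarter = 1 THEN DATE '2020-04-01'
        WHEN Quarter = 2 THEN DATE '2020-07-01'
    END AS "End_Date"
FROM
    your_table;  -- placeholder: substitute your table name (TABLE itself is a reserved word)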
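Rather than adding one CASE branch per quarter, the quarter boundaries can also be computed arithmetically. The following is a rough sketch for a single fixed year, assuming quarter is an integer from 1 to 4; the year 2020 is taken from the example above, and the inline VALUES rows (the ID 1 rows from the question) and the alias t stand in for the real table.

```sql
SELECT
    t.ID,
    t.quarter,
    t.value / 3.0 AS value,
    -- Quarter n starts in month (n - 1) * 3 + 1: 1, 4, 7 or 10.
    MAKE_DATE(2020, (t.quarter - 1) * 3 + 1, 1) AS start_date,
    -- The end date is the start of the following quarter.
    (MAKE_DATE(2020, (t.quarter - 1) * 3 + 1, 1) + INTERVAL '3 months')::DATE AS end_date
FROM (
    VALUES (1, 1, 785), (1, 2, 152)  -- stand-in rows; replace with the real table
) AS t(ID, quarter, value);
```

Like the CASE version, this covers one year only; covering the whole date range still needs a generated series of years or quarters.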