Add variable to count consecutive months

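I have a query in a Postgres database that combines client subscriptions.

I want to add a variable called "consecutive months", but I'm not sure how to add it in Postgres.

My original table looks like this: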

client product Date
1 Sub 2020-10-01
1 Sub 2020-11-01
2 Sub 2020-11-01
2 Sub 2020-12-01
1 Sub 2021-01-01
1 Sub 2021-02-01
2 Sub 2021-02-01

I would like something that counts the consecutive months, restarting after each gap, like this:

client product Date Consecutive_months
1 Sub 2020-10-01 1
1 Sub 2020-11-01 2
2 Sub 2020-11-01 1
2 Sub 2020-12-01 2
1 Sub 2021-01-01 1
1 Sub 2021-02-01 2
2 Sub 2021-02-01 1

Thanks for your help!

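Based on the tags, the OP is apparently aware that this is a gaps-and-islands problem. This query extracts the year and month to build a sequence that increases by one per month. After that, the standard differencing logic (month index minus row number) finds the rows that fall out of step and produces the island marker.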

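with A as (
    select *,
        -- month index (year * 12 + month) minus the row's position within its
        -- client/product partition; the difference stays constant across a run
        -- of consecutive months and changes after every gap
        date_part('year', dt) * 12 + date_part('month', dt)
          - row_number() over (partition by client, product order by dt) as grp
    from T
)
select *,
    -- number the rows within each island, restarting at 1 after a gap
    row_number()
        over (partition by client, product, grp order by dt) as consecutive_months
from A;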

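If multiple rows within the same month are acceptable for a given client and product, switch row_number() to dense_rank() in both places. This works as long as the duplicate rows carry the same dt value, for example dates normalized to the first of the month.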

https://dbfiddle.uk/?rdbms=postgres_9.5&fiddle=397a2f3282cab3b70bd7a47d1dc5ea0a
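To make the differencing step easier to follow, here is a minimal sketch of the same logic in plain Python rather than SQL. The function name `consecutive_months` and the tuple layout are assumptions made for illustration; the data is the sample from the question.

```python
from datetime import date

# Sample rows from the question: (client, product, date).
rows = [
    (1, "Sub", date(2020, 10, 1)),
    (1, "Sub", date(2020, 11, 1)),
    (2, "Sub", date(2020, 11, 1)),
    (2, "Sub", date(2020, 12, 1)),
    (1, "Sub", date(2021, 1, 1)),
    (1, "Sub", date(2021, 2, 1)),
    (2, "Sub", date(2021, 2, 1)),
]


def consecutive_months(rows):
    """Hypothetical helper mirroring the two window functions in the query."""
    row_numbers = {}   # row_number() per (client, product) partition
    island_sizes = {}  # row_number() per (client, product, grp) partition
    out = []
    # Walk each partition in date order, as the "order by dt" clause does.
    for client, product, dt in sorted(rows):
        part = (client, product)
        row_numbers[part] = row_numbers.get(part, 0) + 1
        month_index = dt.year * 12 + dt.month
        # Constant within a run of consecutive months, different after a gap.
        grp = month_index - row_numbers[part]
        island = (client, product, grp)
        island_sizes[island] = island_sizes.get(island, 0) + 1
        out.append((client, product, dt, island_sizes[island]))
    return out


# Order by date, then client, to match the layout of the desired output.
result = sorted(consecutive_months(rows), key=lambda r: (r[2], r[0]))
for row in result:
    print(row)
```

For client 1, the month index minus the row number is the same for 2020-10 and 2020-11, then shifts by one for 2021-01 and 2021-02 because December is missing. That shift is what starts a new island.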

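It looks like you have a Gaps-And-Islands type of problem.

The trick is to calculate a rank per client that stays the same across consecutive dates.

A sequence number can then be calculated per client and rank.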

select client, product, "Date"
, row_number() over (partition by client, daterank order by "Date") as Consecutive_months
from
(
  select "Date", client, product
  -- dense_rank grows by 1 per row while the months elapsed since "Date"
  -- shrink by 1 per month, so the sum is constant within a consecutive run
  , dense_rank() over (partition by client order by "Date")
    + (DATE_PART('year', AGE(current_date, "Date"))*12 +
       DATE_PART('month', AGE(current_date, "Date"))) daterank
  from raw t
) q
order by "Date", client

client | product | Date       | consecutive_months
-----: | :------ | :--------- | -----------------:
     1 | Sub     | 2020-10-01 |                  1
     1 | Sub     | 2020-11-01 |                  2
     2 | Sub     | 2020-11-01 |                  1
     2 | Sub     | 2020-12-01 |                  2
     1 | Sub     | 2021-01-01 |                  1
     1 | Sub     | 2021-02-01 |                  2
     2 | Sub     | 2021-02-01 |                  1
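
As a rough illustration of why daterank stays constant within a run, the sketch below recomputes it in Python for client 1. The fixed `REFERENCE` date is a hypothetical stand-in for current_date, and the simple month difference only matches AGE() under the assumption that every date falls on the first of the month, as in the sample data.

```python
from datetime import date

REFERENCE = date(2021, 3, 1)  # hypothetical stand-in for current_date


def months_before(ref, dt):
    # Whole months from dt to ref; assumes dt is the first day of its month.
    return (ref.year - dt.year) * 12 + (ref.month - dt.month)


def date_ranks(dates):
    """Return (date, daterank) pairs for one client's dates."""
    # dense_rank: equal dates share a rank, and ranks have no holes.
    dense_rank = {d: i + 1 for i, d in enumerate(sorted(set(dates)))}
    return [(d, dense_rank[d] + months_before(REFERENCE, d)) for d in sorted(dates)]


client_1 = [date(2020, 10, 1), date(2020, 11, 1), date(2021, 1, 1), date(2021, 2, 1)]
for d, rank in date_ranks(client_1):
    print(d, rank)
```

The first two dates share one daterank and the last two share another, so numbering the rows within each (client, daterank) pair yields 1, 2, 1, 2.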
