SQL: dense rank in a revolving group pattern

Say I have a table like this:

store date is_open
Bay 1/1/2022 true
Bay 1/2/2022 true
Bay 1/3/2022 true
Bay 1/4/2022 false
Bay 1/5/2022 false
Bay 1/6/2022 false
Bay 1/7/2022 true
Bay 1/8/2022 true
Bay 1/9/2022 true
Walmart 1/7/2022 true
Walmart 1/8/2022 false
Walmart 1/9/2022 true

I want to use PARTITION BY to get a rank for each group, for example:

store date is_open group
Bay 1/1/2022 true 1
Bay 1/2/2022 true 1
Bay 1/3/2022 true 1
Bay 1/4/2022 false 2
Bay 1/5/2022 false 2
Bay 1/6/2022 false 2
Bay 1/7/2022 true 3
Bay 1/8/2022 true 3
Bay 1/9/2022 true 3
Walmart 1/7/2022 true 1
Walmart 1/8/2022 false 2
Walmart 1/9/2022 true 3

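I started by trying to partition by store and is_open, but I am really confused about what to use in the ORDER BY clause. Any help would be appreciated.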

This is really a gaps-and-islands problem. One approach uses the difference-of-row-numbers method:

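WITH cte AS (
    -- rn1 numbers all rows per store; rn2 numbers rows per store and is_open value.
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY store ORDER BY date) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY store, is_open ORDER BY date) AS rn2
    FROM yourTable t
),
cte2 AS (
    -- rn1 - rn2 is constant within an island; take each island's earliest date.
    SELECT t.*,
           MIN(date) OVER (PARTITION BY store, is_open, rn1 - rn2) AS min_date
    FROM cte t
)
-- Islands within a store have distinct start dates, so min_date alone orders them.
SELECT store, date, is_open,
       DENSE_RANK() OVER (PARTITION BY store ORDER BY min_date) AS "group"
FROM cte2
ORDER BY store, date;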

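Note that the second CTE, cte2, finds the minimum date of each island. This is needed to tell apart two islands with different is_open values (true/false) that happen to share the same row-number difference, and it ensures that the earlier island is ranked first.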
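To see that collision on the sample data, here is a minimal sketch that prints the row-number difference for the Bay rows. It assumes Python's sqlite3 module with SQLite 3.25 or later (needed for window functions); the ISO date strings and 1/0 flags are stand-ins I chose for the date and boolean columns, not part of the original table.

```python
import sqlite3

# Bay rows from the question. Assumption: dates as ISO strings (so they sort
# correctly as text) and is_open as 1/0, since SQLite has no boolean type.
flags = [1, 1, 1, 0, 0, 0, 1, 1, 1]
rows = [("Bay", f"2022-01-0{day}", flag) for day, flag in enumerate(flags, start=1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (store TEXT, date TEXT, is_open INTEGER)")
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)

# Same two ROW_NUMBER() expressions as in the first CTE, subtracted directly.
query = """
SELECT date, is_open,
       ROW_NUMBER() OVER (PARTITION BY store ORDER BY date)
     - ROW_NUMBER() OVER (PARTITION BY store, is_open ORDER BY date) AS diff
FROM yourTable
ORDER BY date
"""
result = conn.execute(query).fetchall()
for date, is_open, diff in result:
    print(date, is_open, diff)
```

The first true island gets a difference of 0, while the false island and the second true island both get 3, so the difference alone cannot separate them.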


You can use LAG() to detect the start of each group.

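WITH cte AS (
    -- sflag is 1 on the first row of each run: LAG() returns NULL for a store's
    -- first row, so the comparison is not true and the ELSE branch applies.
    SELECT t.*,
           CASE WHEN LAG(is_open) OVER (PARTITION BY store ORDER BY date) = is_open
                THEN 0 ELSE 1 END AS sflag
    FROM yourTable t
)
-- A running sum of the start flags numbers the groups within each store.
SELECT store, date, is_open,
       SUM(sflag) OVER (PARTITION BY store ORDER BY date) AS grp
FROM cte
ORDER BY store, date;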
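As a quick check, the LAG() query can be run against the sample rows from the question. This is only a sketch under assumptions: it uses Python's sqlite3 module with SQLite 3.25 or later, and it stores dates as ISO strings and is_open as 1/0, which are my substitutions for the original column types.

```python
import sqlite3

# Sample data from the question. Assumption: ISO date strings and 1/0 flags.
bay_flags = [1, 1, 1, 0, 0, 0, 1, 1, 1]
rows = [("Bay", f"2022-01-0{day}", flag) for day, flag in enumerate(bay_flags, start=1)]
rows += [("Walmart", "2022-01-07", 1),
         ("Walmart", "2022-01-08", 0),
         ("Walmart", "2022-01-09", 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (store TEXT, date TEXT, is_open INTEGER)")
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)

# Flag the first row of every run, then take a running sum of the flags.
query = """
WITH cte AS (
    SELECT t.*,
           CASE WHEN LAG(is_open) OVER (PARTITION BY store ORDER BY date) = is_open
                THEN 0 ELSE 1 END AS sflag
    FROM yourTable t
)
SELECT store, date, is_open,
       SUM(sflag) OVER (PARTITION BY store ORDER BY date) AS grp
FROM cte
ORDER BY store, date
"""
result = conn.execute(query).fetchall()
for store, date, is_open, grp in result:
    print(store, date, is_open, grp)
```

With this data the grp column comes out as 1, 1, 1, 2, 2, 2, 3, 3, 3 for Bay and 1, 2, 3 for Walmart, matching the expected output in the question.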
