Counting the number of rows between transitions of values in SQL

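I have rows containing a user_id, a timestamp, and a YES/NO response. I want to count, for each user_id, how many rows are in each streak (run of consecutive rows) of NO responses.

Example: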

user_id timestamp response no_streak
1 2021-01-20 13:59:26 YES 0
1 2021-01-20 14:01:27 NO 1
1 2021-01-20 14:03:21 NO 2
1 2021-01-20 14:07:29 NO 3
1 2021-01-20 14:09:22 YES 0
1 2021-01-20 14:11:26 YES 0
1 2021-01-20 14:13:30 NO 1
1 2021-01-20 14:17:26 NO 2
1 2021-01-20 14:19:29 YES 0
1 2021-01-20 14:25:30 NO 1
1 2021-01-20 14:27:23 NO 2
1 2021-01-20 14:31:23 NO 3
1 2021-01-20 14:35:27 NO 4
1 2021-01-20 14:39:24 YES 0
2 2021-01-20 14:39:24 NO 1
2 2021-01-20 14:47:28 NO 2
2 2021-01-20 14:49:22 NO 3
2 2021-01-20 14:51:25 NO 4
2 2021-01-20 14:53:29 NO 5
2 2021-01-20 14:55:22 NO 6
2 2021-01-20 14:57:22 YES 0

Ultimately, I want to know the length of each streak for every user:

user_id streak length
1 0
1 3
1 2
1 4
2 0
2 6

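I can use LAG() to find where the response transitions from NO to YES and vice versa, but I am having trouble counting the rows between transitions.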

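For each row, count the number of YES rows up to and including that row, so that adjacent NO rows share the same grouping value. Then filter and aggregate: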

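select t.user_id, count(*) as streak_length,
       min(timestamp) as streak_start, max(timestamp) as streak_end
from (select t.*,
             -- running count of YES rows: consecutive NO rows share one grp value
             sum(case when response = 'YES' then 1 else 0 end) over (partition by user_id order by timestamp) as grp
      from t
     ) t
where response = 'NO'
group by t.user_id, grp;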
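
To see why the grouping works, it can help to look at the intermediate grp column on its own. The following is a minimal sketch, not part of the original answer: it assumes PostgreSQL or SQLite 3.25+ (for VALUES inside a CTE and window functions) and inlines the first eight sample rows for user 1 as a stand-in for the real table t.

```sql
-- Sketch: inspect the intermediate grp column on a few rows from the question.
-- Assumption: PostgreSQL or SQLite 3.25+; the CTE stands in for the real table t.
with t(user_id, timestamp, response) as (
    values (1, '2021-01-20 13:59:26', 'YES'),
           (1, '2021-01-20 14:01:27', 'NO'),
           (1, '2021-01-20 14:03:21', 'NO'),
           (1, '2021-01-20 14:07:29', 'NO'),
           (1, '2021-01-20 14:09:22', 'YES'),
           (1, '2021-01-20 14:11:26', 'YES'),
           (1, '2021-01-20 14:13:30', 'NO'),
           (1, '2021-01-20 14:17:26', 'NO')
)
select t.*,
       -- running count of YES rows; the value only changes when a YES appears
       sum(case when response = 'YES' then 1 else 0 end)
           over (partition by user_id order by timestamp) as grp
from t
order by user_id, timestamp;
```

On these rows grp comes out as 1, 1, 1, 1, 2, 3, 3, 3: the three NO rows after the first YES share grp = 1, and the two NO rows after the third YES share grp = 3, so grouping by (user_id, grp) isolates each run of NO rows.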

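Note: This does not return streaks of length 0. I'm not sure that "streak" is the right word for those. But to get them, remove the where filter and use conditional aggregation: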

select t.user_id,
       -- counts only the NO rows, so a group containing just a YES row yields 0
       sum(case when response = 'NO' then 1 else 0 end) as streak_length,
       min(timestamp) as group_start, max(timestamp) as group_end
from (select t.*,
             sum(case when response = 'YES' then 1 else 0 end) over (partition by user_id order by timestamp) as grp
      from t
     ) t
group by t.user_id, grp;
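
The same grp value can also produce the per-row no_streak counter shown in the question's example table. This is a hedged sketch rather than something the answer above states: it assumes PostgreSQL or SQLite 3.25+, and the CTE again inlines the first eight sample rows for user 1 in place of the real table t.

```sql
-- Sketch: per-row running counter of NO rows that resets on each YES.
-- Assumption: PostgreSQL or SQLite 3.25+; the CTE stands in for the real table t.
with t(user_id, timestamp, response) as (
    values (1, '2021-01-20 13:59:26', 'YES'),
           (1, '2021-01-20 14:01:27', 'NO'),
           (1, '2021-01-20 14:03:21', 'NO'),
           (1, '2021-01-20 14:07:29', 'NO'),
           (1, '2021-01-20 14:09:22', 'YES'),
           (1, '2021-01-20 14:11:26', 'YES'),
           (1, '2021-01-20 14:13:30', 'NO'),
           (1, '2021-01-20 14:17:26', 'NO')
)
select user_id, timestamp, response,
       -- running count of NO rows inside each (user_id, grp) group
       sum(case when response = 'NO' then 1 else 0 end)
           over (partition by user_id, grp order by timestamp) as no_streak
from (select t.*,
             -- same grouping expression as in the queries above
             sum(case when response = 'YES' then 1 else 0 end)
                 over (partition by user_id order by timestamp) as grp
      from t
     ) t
order by user_id, timestamp;
```

For these eight rows no_streak is 0, 1, 2, 3, 0, 0, 1, 2, which matches the corresponding rows of the example table. The sketch relies on timestamps being unique within a user; tied timestamps would be counted together by the default window frame.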