Get count of values in different subgroups

I need to remove rows from a dataset where speed equals zero and stays at zero for more than N consecutive rows (say N is 2). The table demo looks like this:

id car speed time
1 foo 0 1
2 foo 0 2
3 foo 0 3
4 foo 1 4
5 foo 1 5
6 foo 0 6
7 bar 0 1
8 bar 0 2
9 bar 5 3
10 bar 5 4
11 bar 5 5
12 bar 5 6

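I would like to use a window function to produce a table like this, where lasting is the length of the run of equal speed values that the row belongs to: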
id car speed time lasting
1 foo 0 1 3
2 foo 0 2 3
3 foo 0 3 3
4 foo 1 4 2
5 foo 1 5 2
6 foo 0 6 1
7 bar 0 1 2
8 bar 0 2 2
9 bar 5 3 4
10 bar 5 4 4
11 bar 5 5 4
12 bar 5 6 4

I could then easily exclude those rows with WHERE NOT (speed = 0 AND lasting > 2).

Here is the code I tried. It does not return the values I expected, and I suspect that nesting FROM (SELECT ... FROM (SELECT ... like this is not the best way to solve the problem:

SELECT g3.*, count(id) OVER (PARTITION BY car, cumsum ORDER BY id) as num
  FROM (SELECT g2.*, sum(grp2) OVER (PARTITION BY car ORDER BY id) AS cumsum
    FROM (SELECT g1.*, (CASE ne0 WHEN 0 THEN 0 ELSE 1 END) AS grp2
      FROM (SELECT g.*, speed - lag(speed, 1, 0) OVER (PARTITION BY car) AS ne0
        FROM (SELECT *, row_number() OVER (PARTITION BY car) AS grp FROM demo) g ) g1 ) g2 ) g3
ORDER BY id;

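You can use the LAG() window function to compare each row's speed with the previous row's speed, and the SUM() window function to turn those comparisons into group numbers for runs of consecutive equal values.
Then the COUNT() window function gives the number of rows in each group, so you can filter out the rows with speed 0 that belong to groups of more than 2 rows: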

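SELECT id, car, speed, time
FROM (
  -- counter: number of rows in each run of equal speed values
  SELECT *, COUNT(*) OVER (PARTITION BY car, grp) counter
  FROM (
    -- grp: running total of the flags, constant within a run
    SELECT *, SUM(flag::int) OVER (PARTITION BY car ORDER BY time) grp
    FROM (
      -- flag: true where speed differs from the previous row (always true on a car's first row)
      SELECT *, speed <> LAG(speed, 1, speed - 1) OVER (PARTITION BY car ORDER BY time) flag
      FROM demo
    ) t
  ) t
) t
-- keep nonzero speeds, and zero-speed runs of at most 2 rows
WHERE speed <> 0 OR counter <= 2
ORDER BY id;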
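
To see what each step produces on the sample data, here is a minimal sketch that runs the same three steps in SQLite through Python's sqlite3 module. Using SQLite is an assumption made only to keep the example self-contained, and it needs a SQLite build with window functions (3.25 or later). Two things differ from the PostgreSQL query: the `flag::int` cast is dropped because a comparison in SQLite already yields 0 or 1, and the first row of each car is flagged with `COALESCE(..., 1)` instead of the `speed - 1` default passed to `LAG()`. The aliases `marked`, `grouped`, `counted` and the helper function names are illustrative, not part of the original query.

```python
import sqlite3

# Sample rows copied from the question's demo table: (id, car, speed, time).
ROWS = [
    (1, "foo", 0, 1), (2, "foo", 0, 2), (3, "foo", 0, 3),
    (4, "foo", 1, 4), (5, "foo", 1, 5), (6, "foo", 0, 6),
    (7, "bar", 0, 1), (8, "bar", 0, 2), (9, "bar", 5, 3),
    (10, "bar", 5, 4), (11, "bar", 5, 5), (12, "bar", 5, 6),
]

# flag    -> 1 where speed differs from the previous row of the same car
#            (COALESCE turns the NULL on each car's first row into 1)
# grp     -> running total of flag, so every run of equal speeds shares one number
# counter -> number of rows in the run
STEPS_SQL = """
SELECT id, car, speed, time, flag, grp,
       COUNT(*) OVER (PARTITION BY car, grp) AS counter
FROM (
  SELECT *, SUM(flag) OVER (PARTITION BY car ORDER BY time) AS grp
  FROM (
    SELECT *,
           COALESCE(speed <> LAG(speed) OVER (PARTITION BY car ORDER BY time), 1) AS flag
    FROM demo
  ) AS marked
) AS grouped
"""


def connect_demo():
    """Create an in-memory database holding the demo table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER, car TEXT, speed INTEGER, time INTEGER)")
    conn.executemany("INSERT INTO demo VALUES (?, ?, ?, ?)", ROWS)
    return conn


def intermediate_rows():
    """Return every row together with its flag, grp and counter columns."""
    conn = connect_demo()
    try:
        return conn.execute(STEPS_SQL + " ORDER BY id").fetchall()
    finally:
        conn.close()


def kept_ids(n=2):
    """Return the ids that survive the filter speed <> 0 OR counter <= n."""
    sql = f"SELECT id FROM ({STEPS_SQL}) AS counted WHERE speed <> 0 OR counter <= ? ORDER BY id"
    conn = connect_demo()
    try:
        return [row[0] for row in conn.execute(sql, (n,))]
    finally:
        conn.close()


if __name__ == "__main__":
    for row in intermediate_rows():
        print(row)
    print(kept_ids())
```

On the sample data the counter column matches the lasting column requested in the question, and only ids 1, 2 and 3 (the run of three zeros for foo) are filtered out.
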

See the demo.
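
Since the question asks whether deeply nested subqueries can be avoided, here is a sketch of a different technique, not the one used in the answer: the "difference of row numbers" form of gaps-and-islands, written with CTEs. Within one car, the row number over all rows minus the row number over rows with the same speed stays constant for a run of equal speeds, so (car, speed, island) identifies a run without needing LAG() or a running SUM(). The example assumes SQLite 3.25+ via Python's sqlite3 module purely so it can be run as is; the CTE names `islands` and `runs`, the column name `island`, and the function `rows_after_filter` are made up for illustration. The window functions themselves are standard SQL, but the query has only been reasoned through against the sample data from the question.

```python
import sqlite3

# Sample rows copied from the question's demo table: (id, car, speed, time).
ROWS = [
    (1, "foo", 0, 1), (2, "foo", 0, 2), (3, "foo", 0, 3),
    (4, "foo", 1, 4), (5, "foo", 1, 5), (6, "foo", 0, 6),
    (7, "bar", 0, 1), (8, "bar", 0, 2), (9, "bar", 5, 3),
    (10, "bar", 5, 4), (11, "bar", 5, 5), (12, "bar", 5, 6),
]

LASTING_SQL = """
WITH islands AS (
  -- island: constant within each run of equal speed values for a car
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY car ORDER BY time)
       - ROW_NUMBER() OVER (PARTITION BY car, speed ORDER BY time) AS island
  FROM demo
),
runs AS (
  -- lasting: length of the run the row belongs to
  SELECT id, car, speed, time,
         COUNT(*) OVER (PARTITION BY car, speed, island) AS lasting
  FROM islands
)
SELECT id, car, speed, time, lasting
FROM runs
WHERE NOT (speed = 0 AND lasting > :n)  -- the filter proposed in the question
ORDER BY id
"""


def rows_after_filter(n=2):
    """Return (id, car, speed, time, lasting) for rows that are not part of
    a zero-speed run longer than n rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE demo (id INTEGER, car TEXT, speed INTEGER, time INTEGER)")
        conn.executemany("INSERT INTO demo VALUES (?, ?, ?, ?)", ROWS)
        return conn.execute(LASTING_SQL, {"n": n}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in rows_after_filter():
        print(row)
```

Both window specifications order by time, which works here because time is unique within each car; with ties in time the two row numbers could be assigned inconsistently, so a tie-breaker such as id would be needed.
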