SQL: Filter to only consecutive numbers

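I have a table ordered by timestamp, and I only want to keep the rows whose step continues the sequence 1, 2, 3, ... (marked with * below).
In imperative programming, it would be: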

prev_step = 0
output = []
for step in table.steps:  # already sorted by timestamp
  if step == prev_step + 1:
    output.append(step)  # desired row
    prev_step = step
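
For reference, the loop above can be wrapped into a runnable function and fed the sample rows from the question. This is only a sketch: the function name `keep_consecutive` and the list-of-tuples representation are assumptions made for illustration, not part of the original post.

```python
def keep_consecutive(rows):
    """Keep each (timestamp, step) row whose step is exactly one more
    than the step of the last row that was kept (starting from 0)."""
    prev_step = 0
    output = []
    for timestamp, step in sorted(rows):  # sort by timestamp
        if step == prev_step + 1:
            output.append((timestamp, step))  # desired row
            prev_step = step
    return output


# Sample rows copied from the question's table.
rows = [
    (100000001, 5), (100000002, 1), (100000003, 1), (100000004, 2),
    (100000005, 2), (100000006, 4), (100000007, 5), (100000008, 3),
    (100000009, 4), (100000010, 2), (100000011, 5), (100000012, 7),
]

kept = keep_consecutive(rows)
for row in kept:
    print(row)
```

On the sample data this keeps the rows with steps 1 through 5, matching the rows marked with * in the question.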

The original table I have (the desired rows are decorated with *, which is not actually part of the data):

| timestamp | step |
| --------- | ---- |
| 100000001 | 5    |
| 100000002 | 1    |*
| 100000003 | 1    | ^
| 100000004 | 2    |*
| 100000005 | 2    | ^
| 100000006 | 4    |
| 100000007 | 5    |
| 100000008 | 3    |*
| 100000009 | 4    |*
| 100000010 | 2    |
| 100000011 | 5    |*
| 100000012 | 7    |

What I want:

| timestamp | step |
| --------- | ---- |
| 100000002 | 1    |*
| 100000004 | 2    |*
| 100000008 | 3    |*
| 100000009 | 4    |*
| 100000011 | 5    |*

All I have come up with is WHERE step - LAG(step) OVER (ORDER BY timestamp) <> 0, but that only removes adjacent duplicates (marked with ^ above). It certainly helps, but it is not enough.
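
To see how far the LAG idea gets, here is a small sketch that runs it on the sample rows in an in-memory SQLite database from Python. The table name `t` and the choice of SQLite are assumptions for illustration (window functions need SQLite 3.25 or newer). Because a window function cannot appear directly in WHERE, LAG is computed in a subquery first.

```python
import sqlite3

# Sample rows copied from the question's table.
rows = [
    (100000001, 5), (100000002, 1), (100000003, 1), (100000004, 2),
    (100000005, 2), (100000006, 4), (100000007, 5), (100000008, 3),
    (100000009, 4), (100000010, 2), (100000011, 5), (100000012, 7),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE t ("timestamp" INTEGER, step INTEGER)')  # hypothetical table name
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Compute the previous step in a subquery, then filter on it outside.
query = """
SELECT "timestamp", step
FROM (
    SELECT "timestamp", step,
           LAG(step) OVER (ORDER BY "timestamp") AS prev_step
    FROM t
) AS s
WHERE step - prev_step <> 0
ORDER BY "timestamp"
"""

lag_steps = [step for _, step in conn.execute(query)]
print(lag_steps)
```

The adjacent duplicates disappear, but out-of-sequence steps such as 4 and 5 before the first 3 are still present. The very first row is also dropped, because its LAG value is NULL and the comparison does not evaluate to true.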

Thanks in advance!

Here is a solution that relies on a correlated subquery to detect the correct record to keep at each step.

WITH cte AS (
    SELECT t1.*, (SELECT COUNT(DISTINCT t2.step) FROM yourTable t2
                 WHERE t2."timestamp" < t1."timestamp" AND t2.step < t1.step) AS cnt
    FROM yourTable t1
),
cte2 AS (
    SELECT t.*, ROW_NUMBER() OVER (PARTITION BY cnt ORDER BY step, "timestamp") rn
    FROM cte t
)

SELECT t1."timestamp", t1.step, t1.rn, t1.cnt
FROM cte2 t1
WHERE rn = 1 AND (step = 1 OR EXISTS (SELECT 1 FROM yourTable t2
                                      WHERE t2.step = t1.step - 1))
ORDER BY "timestamp";

Demo
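
To try the query above without a database server, it can be run against the question's sample rows in an in-memory SQLite database from Python. This is only a sketch: the table name `yourTable` comes from the answer, while the use of SQLite is an assumption (window functions need SQLite 3.25 or newer). Passing on this one sample does not show that the query is correct for every input.

```python
import sqlite3

# Sample rows copied from the question's table.
rows = [
    (100000001, 5), (100000002, 1), (100000003, 1), (100000004, 2),
    (100000005, 2), (100000006, 4), (100000007, 5), (100000008, 3),
    (100000009, 4), (100000010, 2), (100000011, 5), (100000012, 7),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourTable ("timestamp" INTEGER, step INTEGER)')
conn.executemany("INSERT INTO yourTable VALUES (?, ?)", rows)

# The answer's query, unchanged apart from whitespace.
query = """
WITH cte AS (
    SELECT t1.*, (SELECT COUNT(DISTINCT t2.step) FROM yourTable t2
                  WHERE t2."timestamp" < t1."timestamp" AND t2.step < t1.step) AS cnt
    FROM yourTable t1
),
cte2 AS (
    SELECT t.*, ROW_NUMBER() OVER (PARTITION BY cnt ORDER BY step, "timestamp") rn
    FROM cte t
)
SELECT t1."timestamp", t1.step, t1.rn, t1.cnt
FROM cte2 t1
WHERE rn = 1 AND (step = 1 OR EXISTS (SELECT 1 FROM yourTable t2
                                      WHERE t2.step = t1.step - 1))
ORDER BY "timestamp"
"""

subquery_kept = [(ts, step) for ts, step, rn, cnt in conn.execute(query)]
for row in subquery_kept:
    print(row)
```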

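One method is a recursive CTE. Unfortunately, recursive CTEs are limited in what they allow. So one approach is to generate every path through the data in step order, and then choose the minimum timestamp for each step: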

with cte(ts, step) as (
      (select ts, step
      from t
      where step = 1
      order by ts
      fetch first 1 row only)
      union all
      select t.ts, t.step
      from cte join
           t
           on t.ts >= cte.ts and t.step = cte.step + 1
     )
select *
from (select cte.*,
             row_number() over (partition by step order by ts) as seqnum
      from cte
     ) cte
where seqnum = 1;

Here is a db<>fiddle.
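
The recursive approach can also be tried locally with Python's built-in sqlite3 module. The sketch below adapts the query to SQLite's dialect, and those adaptations are assumptions rather than part of the answer: `WITH RECURSIVE` instead of `WITH`, `LIMIT 1` instead of `FETCH FIRST 1 ROW ONLY`, and the anchor wrapped in a subquery because SQLite does not accept ORDER BY or LIMIT on one branch of a UNION ALL. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows copied from the question's table.
rows = [
    (100000001, 5), (100000002, 1), (100000003, 1), (100000004, 2),
    (100000005, 2), (100000006, 4), (100000007, 5), (100000008, 3),
    (100000009, 4), (100000010, 2), (100000011, 5), (100000012, 7),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ts INTEGER, step INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

query = """
WITH RECURSIVE cte(ts, step) AS (
    -- Anchor: the earliest row with step = 1.
    SELECT ts, step
    FROM (SELECT ts, step FROM t WHERE step = 1 ORDER BY ts LIMIT 1)
    UNION ALL
    -- Recursive part: every later row that continues the sequence by one.
    SELECT t.ts, t.step
    FROM cte JOIN t ON t.ts >= cte.ts AND t.step = cte.step + 1
)
SELECT ts, step
FROM (
    SELECT cte.*,
           ROW_NUMBER() OVER (PARTITION BY step ORDER BY ts) AS seqnum
    FROM cte
) AS ranked
WHERE seqnum = 1  -- earliest timestamp reached for each step
ORDER BY step
"""

recursive_kept = conn.execute(query).fetchall()
for row in recursive_kept:
    print(row)
```

Because the CTE enumerates every path, it can produce many duplicate intermediate rows on larger inputs, so this is better suited to checking the logic than to large tables.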