ROW_NUMBER over PARTITION BY: restart the row counter between breaks

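I have a list of activities that is currently sorted by user, activity date and time, and ID. I want to generate a sequence number within each group defined by those same fields. The code below gets me most of the way there. However, when the same ID recurs later, I need the row numbering to restart rather than continue from the previous run.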

Here is my code:

ROW_NUMBER() OVER (PARTITION BY USER_ID, foc_id ORDER BY USER_ID, to_char(activity_date, 'MM/DD/YYYY HH24:MI:SS'), foc_id) seq_nbr

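In my data, FOC_ID "A240" has activity around 2:20 PM. FOC_ID "B410" then has activity around 3:19 PM, and finally the user returns to "A240" with additional activity around 3:20 PM. Because there is activity between the first and second series of "A240" events, I need the row number (seq_nbr) to restart instead of continuing from the earlier activity.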

You can use MATCH_RECOGNIZE:

SELECT user_id,
       activity_date,
       foc_id,
       ROW_NUMBER() OVER ( PARTITION BY user_id, mno ORDER BY activity_date ) AS seq_num
FROM   table_name
MATCH_RECOGNIZE (
  PARTITION BY user_id
  ORDER     BY activity_date
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN ( same_foc_id* last_row  )
  DEFINE
    same_foc_id AS FIRST( foc_id ) = NEXT( foc_id )
)
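
To see how the pattern splits the rows into runs, it can help to expose the match number and the matched pattern variable next to each row. The query below is a diagnostic sketch rather than part of the solution. The alias cls is introduced here for illustration, and the query assumes Oracle 12c or later, where MATCH_RECOGNIZE and its CLASSIFIER() function are available.

```sql
-- Diagnostic sketch: show which pattern variable each row matched.
-- "cls" is a hypothetical alias added for illustration.
SELECT user_id,
       activity_date,
       foc_id,
       mno,
       cls
FROM   table_name
MATCH_RECOGNIZE (
  PARTITION BY user_id
  ORDER     BY activity_date
  MEASURES
    MATCH_NUMBER() AS mno,  -- increases by one for each run
    CLASSIFIER()   AS cls   -- name of the variable the row matched
  ALL ROWS PER MATCH
  PATTERN ( same_foc_id* last_row )
  DEFINE
    -- true while the following row has the same foc_id as the run's first row
    same_foc_id AS FIRST( foc_id ) = NEXT( foc_id )
)
ORDER BY user_id, activity_date
```

Every row except the last one in a run should classify as SAME_FOC_ID, and the closing row as LAST_ROW. LAST_ROW has no DEFINE entry, so it matches any row, including the final row of the partition where NEXT( foc_id ) is NULL. Since mno increases by one per run, partitioning the outer ROW_NUMBER by user_id, mno restarts the count at each break.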

Or, use multiple ROW_NUMBERs:

SELECT user_id,
       activity_date,
       foc_id,
       ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id, grp ORDER BY activity_date ) AS seq_num
FROM   (
  SELECT user_id,
         activity_date,
         foc_id,
         ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY activity_date )
           - ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id ORDER BY activity_date ) AS grp
  FROM   table_name
)
ORDER BY user_id, activity_date
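
The subtraction works because, within an unbroken run of one foc_id, both row numbers advance by one per row, so their difference stays constant. When rows with another foc_id intervene, only the per-user counter advances, so the difference jumps. The following diagnostic sketch exposes the intermediate values. The aliases rn_all and rn_foc are introduced here for illustration.

```sql
-- Diagnostic sketch: show the two counters and their difference.
-- "rn_all" and "rn_foc" are hypothetical aliases added for illustration.
SELECT user_id,
       activity_date,
       foc_id,
       rn_all,
       rn_foc,
       rn_all - rn_foc AS grp
FROM   (
  SELECT user_id,
         activity_date,
         foc_id,
         -- position among all of the user's rows
         ROW_NUMBER() OVER ( PARTITION BY user_id ORDER BY activity_date ) AS rn_all,
         -- position among the user's rows for this foc_id only
         ROW_NUMBER() OVER ( PARTITION BY user_id, foc_id ORDER BY activity_date ) AS rn_foc
  FROM   table_name
)
ORDER BY user_id, rn_all
```

For the sample data this should give grp = 0 for the first A240 run, 4 for the B410 run, and 4 again for the second A240 run. So grp on its own is not unique across different foc_id values, which is why the outer ROW_NUMBER partitions by user_id, foc_id, grp. Rows that tie on activity_date are numbered in an unspecified order. If ties can span different foc_id values, adding a unique tie-breaker column (if the table has one) to every ORDER BY would make the result deterministic.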

Which, for the sample data:

CREATE TABLE table_name ( user_id, activity_date, foc_id ) AS
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:20:34' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '14:21:23' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:39' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:44' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:19:58' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:20:11' HOUR TO SECOND, 'B410' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:16' HOUR TO SECOND, 'A240' FROM DUAL UNION ALL
SELECT 'UVAC3', DATE '2020-11-04' + INTERVAL '15:22:33' HOUR TO SECOND, 'A240' FROM DUAL;

Both output:

USER_ID | ACTIVITY_DATE       | FOC_ID | SEQ_NUM
:------ | :------------------ | :----- | ------:
UVAC3   | 2020-11-04 14:20:34 | A240   |       1
UVAC3   | 2020-11-04 14:21:23 | A240   |       2
UVAC3   | 2020-11-04 14:21:23 | A240   |       3
UVAC3   | 2020-11-04 14:21:23 | A240   |       4
UVAC3   | 2020-11-04 15:19:39 | B410   |       1
UVAC3   | 2020-11-04 15:19:44 | B410   |       2
UVAC3   | 2020-11-04 15:19:58 | B410   |       3
UVAC3   | 2020-11-04 15:20:11 | B410   |       4
UVAC3   | 2020-11-04 15:22:16 | A240   |       1
UVAC3   | 2020-11-04 15:22:33 | A240   |       2
