SQL Server: assign the same row number to rows until a condition is met

Question

I have the following dataset:

    ukey          id          code                    create_date       
     1           1082         9053             2018-03-01 23:18:51.0000000
     2           1082         9035             2018-03-01 23:19:21.0000000
     3           1082         9053             2018-03-01 23:22:55.0000000
     4           1082         9535             2018-03-01 23:23:30.0000000
     5           1196         3145             2018-03-05 07:27:15.0000000
     6           1196         3162             2018-03-05 07:27:50.0000000
     7           1196         3175             2018-03-05 07:28:24.0000000
     8           1196         3235             2018-03-05 07:28:57.0000000
     9           1196         3295             2018-03-05 07:29:31.0000000
     10          1196         3448             2018-03-05 07:30:04.0000000
     11          1196         3465             2018-03-05 07:30:37.0000000
     12          1196         4265             2018-03-05 07:31:09.0000000
     13          1196         3495             2018-03-05 17:13:13.0000000
     14           473          551             2018-03-02 16:43:52.0000000
     15           473          590             2018-03-02 16:44:30.0000000
     16           473          835             2018-03-02 16:45:02.0000000

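I want another column in this dataset, say RN. The RN column should keep the same value until either of the following two conditions is met:

If the id is different, or
If the datediff between the current row and the previous row is more than 60 seconds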

So the final dataset should look like this:

    ukey          id          code                    create_date                RN
     1           1082         9053             2018-03-01 23:18:51.0000000        1
     2           1082         9035             2018-03-01 23:19:21.0000000        1
     3           1082         9053             2018-03-01 23:22:55.0000000        1
     4           1082         9535             2018-03-01 23:23:30.0000000        1
     5           1196         3145             2018-03-05 07:27:15.0000000        2
     6           1196         3162             2018-03-05 07:27:50.0000000        2
     7           1196         3175             2018-03-05 07:28:24.0000000        2
     8           1196         3235             2018-03-05 07:28:57.0000000        2
     9           1196         3295             2018-03-05 07:29:31.0000000        2
     10          1196         3448             2018-03-05 07:30:04.0000000        2
     11          1196         3465             2018-03-05 07:30:37.0000000        2
     12          1196         4265             2018-03-05 07:31:09.0000000        2
     13          1196         3495             2018-03-05 17:13:13.0000000        3
     14           473          551             2018-03-02 16:43:52.0000000        4
     15           473          590             2018-03-02 16:44:30.0000000        4
     16           473          835             2018-03-02 16:45:02.0000000        4

Notice the change in RN from 2 to 3: the id is the same, but the datediff is more than 60 seconds.

This is the query I have tried so far. My original dataset is in dbo.myTable:

    select
        a.id as A_ID,
        a.code as A_Code,
        a.create_date as A_CreateDate,
        a.RN as A_RN,
        _.id as _ID,
        _.code as _Code,
        _.create_date as _CreateDate,
        _.RN as _RN
    from myTable a
    cross apply
    (
        select * from myTable b
        where a.id=b.id and (a.ukey=b.ukey-1 or a.ukey=1) and datediff(ss,a.create_date,b.create_date)<60
    )_
    order by a.id, a.create_date

This query does not give me the expected result. My idea is to compare each row with the previous row and check the two conditions.

Answer 1

This sounds like you want lag() and a cumulative sum:

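    select t.*,
           -- Running total of "new group" flags. A row gets flag 1 when it is the
           -- first row of its id (prev_create_date is null) or when the previous
           -- row of the same id is 60 seconds or more earlier; otherwise flag 0.
           sum(case when prev_create_date > dateadd(second, -60, create_date)
                    then 0 else 1
               end) over
               (order by ukey) as rn
    from (select t.*,
                 -- create_date of the previous row within the same id (null for the first row of each id)
                 lag(create_date) over (partition by id order by create_date) as prev_create_date
          from myTable t
         ) t;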
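
To make the two steps easier to follow, here is a minimal sketch that exposes the intermediate flag next to the running sum. It assumes SQL Server 2012 or later (for lag()) and uses five rows copied from the question's data (ukey 11 to 15) in an inline CTE. The names sample, flagged and is_new_group are made up for illustration and are not part of the original query.

```sql
-- Inline sample: ukey 11-15 from the question, so the block runs without myTable.
with sample as (
    select ukey, id, code, cast(create_date as datetime2) as create_date
    from (values
        (11, 1196, 3465, '2018-03-05 07:30:37'),
        (12, 1196, 4265, '2018-03-05 07:31:09'),
        (13, 1196, 3495, '2018-03-05 17:13:13'),
        (14,  473,  551, '2018-03-02 16:43:52'),
        (15,  473,  590, '2018-03-02 16:44:30')
    ) v(ukey, id, code, create_date)
),
flagged as (
    -- Step 1: fetch the previous create_date within the same id.
    select s.*,
           lag(create_date) over (partition by id order by create_date) as prev_create_date
    from sample s
)
-- Step 2: flag rows that start a new group, then accumulate the flags.
select ukey, id, create_date, prev_create_date,
       case when prev_create_date > dateadd(second, -60, create_date)
            then 0 else 1
       end as is_new_group,
       sum(case when prev_create_date > dateadd(second, -60, create_date)
                then 0 else 1
           end) over (order by ukey) as rn
from flagged
order by ukey;

-- Expected result for these five rows:
-- ukey  id    is_new_group  rn
-- 11    1196  1             1   (first row of id 1196, prev is null)
-- 12    1196  0             1   (32 seconds after ukey 11)
-- 13    1196  1             2   (same id, but hours later)
-- 14    473   1             3   (first row of id 473, prev is null)
-- 15    473   0             3   (38 seconds after ukey 14)
```

Two things are worth checking against your own rules. The comparison uses a strict `>`, so a gap of exactly 60 seconds starts a new group; switch to `>=` if "more than 60 seconds" should leave a 60-second gap in the same group. Also, in the question's full sample, ukey 2 and ukey 3 are 214 seconds apart, so under the stated rule this approach starts a new group at ukey 3, which differs from the RN shown in the expected output.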

sql

tsql

sql-server

row-number