Start date and end date calculation based on a date column

I am trying to calculate StartDate and EndDate based on a date column in a table. The source table looks like this:

Scenario 1

ID SERIAL_NUMBER STATUS READ_DT
123456789 42007 D 15-12-2021
123456789 42007 D 16-12-2021
123456789 42007 D 17-12-2021
123456789 42007 D 18-12-2021
123456789 42007 D 19-12-2021
123456789 42007 D 20-12-2021
123456789 42007 D 21-12-2021

I want to calculate start_date and end_date from READ_DT. For a given ID and SERIAL_NUMBER, if every READ_DT is present (no gaps), the output should be as follows:

ID SERIAL_NUMBER STATUS Start_Date End_Date
123456789 42007 D 15-12-2021 21-12-2021

Scenario 2

ID SERIAL_NUMBER STATUS READ_DT
123456789 42007 D 15-12-2021
123456789 42007 D 16-12-2021
123456789 42007 D 17-12-2021
123456789 42007 D 19-12-2021
123456789 42007 D 20-12-2021
123456789 42007 D 21-12-2021

If there is any gap between the READ_DT values, the expected output should be split into two rows, as follows:

ID SERIAL_NUMBER STATUS Start_Date End_Date
123456789 42007 D 15-12-2021 17-12-2021
123456789 42007 D 19-12-2021 21-12-2021

For scenario 1, you can simply use the aggregate MIN and MAX functions and group by the remaining columns.

select ID,SERIAL_NUMBER, STATUS, convert(varchar, min(READ_DT), 105) as Start_Date, convert(varchar, max(READ_DT), 105) as End_Date 
from tb1
group by ID,SERIAL_NUMBER, STATUS

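For scenario 2, I used the LAG function to get the date difference between the current row and the previous row, and then aggregated.

This code works for both the scenario 1 and scenario 2 data.

Code: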

   drop table if exists #t1 
  --stores diff_days and missing date from sequence
   SELECT READ_DT,
     case when DATEDIFF(day, LAG(READ_DT) OVER (ORDER BY READ_DT), READ_DT ) is NULL then 1 
     else DATEDIFF(day, LAG(READ_DT) OVER (ORDER BY READ_DT), READ_DT )
     end AS diff_day
    ,case when DATEDIFF(day, LAG(READ_DT) OVER (ORDER BY READ_DT), READ_DT ) >1 then DATEADD(day, -1, READ_DT)
     end as diff_read_dt
    into #t1
    from tb2

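   --update diff_day for rows after the missing date so that each run of dates gets its own group key
   --note: the subquery must return a single row, so this assumes at most one gap of one missing day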
   update #t1
   set diff_day = diff_day+1
   where convert(date,READ_DT) > (select dateadd(day,1,convert(date,diff_read_dt)) from #t1 where diff_read_dt is not null)

  --get the required results using min and max

   select a.ID, a.SERIAL_NUMBER, a.STATUS, convert(varchar, min(a.READ_DT), 105) as Start_Date, convert(varchar, max(a.READ_DT), 105) as End_Date 
   from tb2 a
   inner join #t1 b on convert(date,a.READ_DT) = convert(date,b.READ_DT)
   group by a.ID, a.SERIAL_NUMBER, a.STATUS, b.diff_day
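
The LAG idea above can be generalized to any number of gaps by flagging each row that starts a new run and taking a running sum of those flags as the group key. The following is a minimal Python sketch of that variation, not the exact logic of the temp-table code; the function name `ranges_by_gap_flag` is hypothetical and the input is assumed to be the READ_DT values of a single ID, SERIAL_NUMBER, STATUS group.

```python
from datetime import date
from itertools import accumulate


def ranges_by_gap_flag(read_dates):
    """Return (start, end) pairs, one per run of consecutive dates."""
    ordered = sorted(read_dates)
    # Flag 1 for the first row and for any row more than one day after
    # its predecessor (the same difference LAG + DATEDIFF would give).
    starts = [
        1 if i == 0 or (d - ordered[i - 1]).days > 1 else 0
        for i, d in enumerate(ordered)
    ]
    # The running sum of the flags is a group id shared by each run.
    group_ids = list(accumulate(starts))
    ranges = {}
    for gid, d in zip(group_ids, ordered):
        start, _ = ranges.get(gid, (d, d))
        ranges[gid] = (start, d)
    return [ranges[gid] for gid in sorted(ranges)]


# Hypothetical data with two gaps of different sizes.
sample = [date(2021, 12, d) for d in (15, 16, 20, 21, 25)]
for start, end in ranges_by_gap_flag(sample):
    print(start.strftime("%d-%m-%Y"), end.strftime("%d-%m-%Y"))
```

In T-SQL the same shape would typically be a CASE expression over LAG followed by a windowed SUM, which avoids the temp table and the UPDATE step.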

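A little sequential date math simplifies this. Within a run of consecutive dates, subtracting an increasing row number from each date yields the same value for every row, so that value can serve as the grouping key.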

--===== This will work for either scenario
   WITH cteDTgrp AS
(--==== Subtract an increasing number of days from each date to create the date groups.
 SELECT *
        ,DT_Grp = DATEADD(dd,-ROW_NUMBER() OVER (PARTITION BY ID,SERIAL_NUMBER,STATUS ORDER BY READ_DT),READ_DT)
   FROM dbo.YourTableNameHere
)--==== Then the grouping to get the start and end dates is trivial.
 SELECT ID,SERIAL_NUMBER,STATUS
        ,Start_Date = MIN(READ_DT)
        ,End_Date   = MAX(READ_DT)
   FROM cteDTgrp
  GROUP BY ID,SERIAL_NUMBER,STATUS,DT_Grp --<----This is the key!
  ORDER BY ID,SERIAL_NUMBER,STATUS,Start_Date
;
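
To see why the query above works, it can help to trace the same arithmetic outside the database. The following is a small Python sketch of the technique, assuming the dates belong to one ID, SERIAL_NUMBER, STATUS group and are unique; the function name `date_ranges` is made up for illustration.

```python
from datetime import date, timedelta
from itertools import groupby


def date_ranges(read_dates):
    """Collapse unique dates into (start, end) runs of consecutive days."""
    ordered = sorted(read_dates)
    # Date minus its 1-based position mirrors DATEADD(dd, -ROW_NUMBER(), READ_DT):
    # the result is constant within a run and changes after every gap.
    keyed = [(d - timedelta(days=i), d) for i, d in enumerate(ordered, start=1)]
    ranges = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        days = [d for _, d in group]
        ranges.append((days[0], days[-1]))
    return ranges


scenario_2 = [date(2021, 12, d) for d in (15, 16, 17, 19, 20, 21)]
for start, end in date_ranges(scenario_2):
    print(start.strftime("%d-%m-%Y"), end.strftime("%d-%m-%Y"))
# prints:
# 15-12-2021 17-12-2021
# 19-12-2021 21-12-2021
```

For the scenario 2 data the group keys are 14-12-2021 for the first three rows and 15-12-2021 for the last three, which is exactly the split the expected output asks for.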

Note that this only works if READ_DT is unique within each ID, SERIAL_NUMBER, STATUS group.
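
The uniqueness requirement can be illustrated with a short Python sketch. A duplicate date consumes an extra row number, which shifts the group key for every later row and splits one run into two. Ranking equal dates equally, as DENSE_RANK does in SQL, is one possible way around this; that substitution is a suggestion here, not part of the original answer, and the helper name `group_keys` is hypothetical.

```python
from datetime import date, timedelta


def group_keys(read_dates, dense=False):
    """Return the group key (date minus its rank) for each sorted date.

    dense=False numbers rows like ROW_NUMBER; dense=True gives equal
    dates the same rank, like DENSE_RANK.
    """
    keys = []
    rank = 0
    previous = None
    for position, d in enumerate(sorted(read_dates), start=1):
        if d != previous:
            rank += 1
        previous = d
        offset = rank if dense else position
        keys.append(d - timedelta(days=offset))
    return keys


# Hypothetical data: 16-12-2021 appears twice.
with_duplicate = [date(2021, 12, d) for d in (15, 16, 16, 17)]
print(len(set(group_keys(with_duplicate))))              # 2 distinct keys: run is split
print(len(set(group_keys(with_duplicate, dense=True))))  # 1 distinct key: run stays whole
```

Deduplicating the rows before numbering them would have the same effect as the dense ranking.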