Minimum and maximum dates within continuous date range grouped by name

I have a data set containing start and end dates for a person, and I want to get only the continuous date ranges for each person:

Input:

NAME | STARTDATE      | ENDDATE
--------------------------------------
MIKE | **2019-05-15** | 2019-05-16 
MIKE | 2019-05-17     | **2019-05-18**
MIKE | 2020-05-18     | 2020-05-19

The expected output is:

MIKE | **2019-05-15** | **2019-05-18** 
MIKE | 2020-05-18     | 2020-05-19

So basically the output is the MIN and MAX for each continuous period of that person.

Any help is appreciated.

I have tried the following query:

With N AS (
  SELECT Name, StartDate, EndDate
       , LastStop = MAX(EndDate)
                    OVER (PARTITION BY Name ORDER BY StartDate, EndDate
                          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
  FROM   Table
), B AS (
  SELECT Name, StartDate, EndDate
       , Block = SUM(CASE WHEN LastStop Is Null Then 1
                          WHEN LastStop < StartDate Then 1
                          ELSE 0
                    END)
                 OVER (PARTITION BY Name ORDER BY StartDate, LastStop)
  FROM   N
)
SELECT Name
     , MIN(StartDate) DateFrom
     , MAX(EndDate) DateTo
FROM   B
GROUP BY Name, Block
ORDER BY Name, Block

But it does not take continuous periods into account. It returns the same rows as the input.
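The attempt above returns the input unchanged because `LastStop < StartDate` is true whenever the next range starts the day after the previous one ends (2019-05-16 < 2019-05-17), so every row opens a new block. A minimal sketch of one possible fix, assuming that ranges which touch on consecutive days should be merged, is to compare against the day after `LastStop`. The table variable `@Ranges` and its sample rows are hypothetical stand-ins for the real table:

```sql
-- Hypothetical sample data mirroring the input shown in the question
declare @Ranges table (Name varchar(20), StartDate date, EndDate date);
insert into @Ranges values
  ('MIKE', '2019-05-15', '2019-05-16'),
  ('MIKE', '2019-05-17', '2019-05-18'),
  ('MIKE', '2020-05-18', '2020-05-19');

with N as (
  select Name, StartDate, EndDate
       , LastStop = max(EndDate)
                    over (partition by Name order by StartDate, EndDate
                          rows between unbounded preceding and 1 preceding)
  from   @Ranges
), B as (
  select Name, StartDate, EndDate
         -- a new block starts only when at least one full day is missing
         -- between the latest end seen so far and this start
       , Block = sum(case when LastStop is null then 1
                          when dateadd(day, 1, LastStop) < StartDate then 1
                          else 0
                     end)
                 over (partition by Name order by StartDate, LastStop)
  from   N
)
select Name
     , min(StartDate) DateFrom
     , max(EndDate)   DateTo
from   B
group by Name, Block
order by Name, Block;
```

With the sample rows, the first two ranges fall into block 1 and the third into block 2, which gives the two rows in the expected output.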

Here is an example using an ad-hoc tally table

Example or dbFiddle

;with cte as (
Select A.[Name]
      ,B.D
      ,Grp  = datediff(day,'1900-01-01',D) - dense_rank() over (partition by [Name] Order by D)
 From  YourTable A
 Cross Apply ( 
                Select Top (DateDiff(DAY,StartDate,EndDate)+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),StartDate) 
                 From  master..spt_values n1,master..spt_values n2 
             ) B

)
Select [Name]
      ,StartDate= min(D)
      ,EndDate  = max(D)
 From  cte
 Group By [Name],Grp

Returns

Name    StartDate   EndDate
MIKE    2019-05-15  2019-05-18
MIKE    2020-05-18  2020-05-19

Just to help with the visualization, the CTE generates the following
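To inspect the intermediate rows yourself, a sketch along these lines selects straight from the CTE. The table variable `@YourTable`, its sample rows, and the extra `DayNo` and `RankNo` columns are additions for illustration and are not part of the answer's query:

```sql
-- Hypothetical sample data mirroring the input shown in the question
declare @YourTable table ([Name] varchar(20), StartDate date, EndDate date);
insert into @YourTable values
  ('MIKE', '2019-05-15', '2019-05-16'),
  ('MIKE', '2019-05-17', '2019-05-18'),
  ('MIKE', '2020-05-18', '2020-05-19');

;with cte as (
Select A.[Name]
      ,B.D
       -- day number and rank both grow by 1 on consecutive days,
       -- so their difference stays constant within one continuous run
      ,DayNo  = datediff(day,'1900-01-01',B.D)
      ,RankNo = dense_rank() over (partition by A.[Name] Order by B.D)
      ,Grp    = datediff(day,'1900-01-01',B.D) - dense_rank() over (partition by A.[Name] Order by B.D)
 From  @YourTable A
 Cross Apply (
                -- expand each range into one row per calendar day
                Select Top (DateDiff(DAY,A.StartDate,A.EndDate)+1)
                       D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),A.StartDate)
                 From  master..spt_values n1,master..spt_values n2
             ) B
)
Select *
 From  cte
 Order By [Name],D;
```

For the sample rows this lists six days: the four days from 2019-05-15 to 2019-05-18 share one `Grp` value, and the two days in 2020 share a different one, which is why grouping by `Grp` yields the two islands.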

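This gives you the same result for the sample data. Note that it groups by month and year, so it assumes that a continuous period never crosses a month boundary.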

SELECT subquery.name, min(subquery.startdate), max(subquery.enddate1)
FROM (SELECT NAME, startdate,
             CASE WHEN EXISTS(SELECT yt1.startdate
                              FROM t yt1
                              WHERE yt1.startdate = DATEADD(day, 1, yt2.enddate)
                             ) THEN null ELSE yt2.enddate END as enddate1
      FROM t yt2) as subquery
GROUP BY NAME, CAST(MONTH(subquery.startdate) AS VARCHAR(2)) + '-' + CAST(YEAR(subquery.startdate) AS VARCHAR(4))

For CASE WHEN EXISTS I referred to SQL CASE

For grouping by month and year, see GROUP BY MONTH AND YEAR

DB_FIDDLE

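This is a type of gaps-and-islands problem. There is no need to expand the data out by day! That seems very inefficient.

Instead, determine where the "islands" start. This is where there is no overlap with the previous row; in your case lag() is sufficient. Then do a cumulative sum and an aggregation: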

select name, min(startdate), max(enddate)
from (select t.*,
             sum(case when prev_enddate >= dateadd(day, -1, startdate) then 0 else 1 end) over 
                 (partition by name order by startdate) as grp
      from (select t.*,
                   lag(enddate) over (partition by name order by startdate) as prev_enddate
            from t
           ) t
     ) t
group by name, grp;
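To see how the islands are detected, the inner steps can be run on their own. This is a sketch under the assumption that the data matches the question's input; the table variable `@t`, its rows, and the `is_island_start` column name are hypothetical:

```sql
-- Hypothetical sample data mirroring the input shown in the question
declare @t table (name varchar(20), startdate date, enddate date);
insert into @t values
  ('MIKE', '2019-05-15', '2019-05-16'),
  ('MIKE', '2019-05-17', '2019-05-18'),
  ('MIKE', '2020-05-18', '2020-05-19');

select s.name, s.startdate, s.enddate, s.prev_enddate,
       -- 1 marks the first row of a new island, 0 continues the current one;
       -- a null prev_enddate (first row per name) falls through to the else branch
       case when s.prev_enddate >= dateadd(day, -1, s.startdate) then 0 else 1 end as is_island_start
from (select t.*,
             lag(enddate) over (partition by name order by startdate) as prev_enddate
      from @t t
     ) s
order by s.name, s.startdate;
```

For the three sample rows the flag comes out as 1, 0, 1, so the running sum used as grp is 1, 1, 2, and the first two rows collapse into a single range.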

Here is a db<>fiddle.