按名称分组的连续日期范围内的最小和最大日期
Minimum and maximum dates within continuous date range grouped by name
我有一个包含一个人的开始日期和结束日期的数据范围,我只想获得每个人的连续日期范围:
输入:
NAME | STARTDATE | END DATE
--------------------------------------
MIKE | **2019-05-15** | 2019-05-16
MIKE | 2019-05-17 | **2019-05-18**
MIKE | 2020-05-18 | 2020-05-19
预期输出如下:
MIKE | **2019-05-15** | **2019-05-18**
MIKE | 2020-05-18 | 2020-05-19
所以对于这个人的每个连续周期,基本上输出是 MIN 和 MAX。
感谢任何帮助。
我试过以下查询:
With N AS ( SELECT Name, StartDate, EndDate
, LastStop = MAX(EndDate)
OVER (PARTITION BY Name ORDER BY StartDate, EndDate
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) FROM Table ), B AS ( SELECT Name, StartDate, EndDate
, Block = SUM(CASE WHEN LastStop Is Null Then 1
WHEN LastStop < StartDate Then 1
ELSE 0
END)
OVER (PARTITION BY Name ORDER BY StartDate, LastStop) FROM N ) SELECT Name
, MIN(StartDate) DateFrom
, MAX(EndDate) DateTo FROM B GROUP BY Name, Block ORDER BY Name, Block
但是没有考虑连续周期。它显示相同的输入。
这是一个使用临时计数的示例 table
示例或dbFiddle
;with cte as (
Select A.[Name]
,B.D
,Grp = datediff(day,'1900-01-01',D) - dense_rank() over (partition by [Name] Order by D)
From YourTable A
Cross Apply (
Select Top (DateDiff(DAY,StartDate,EndDate)+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),StartDate)
From master..spt_values n1,master..spt_values n2
) B
)
Select [Name]
,StartDate= min(D)
,EndDate = max(D)
From cte
Group By [Name],Grp
Returns
Name StartDate EndDate
MIKE 2019-05-15 2019-05-18
MIKE 2020-05-18 2020-05-19
只是为了帮助可视化,CTE 生成以下内容
这会给你相同的结果
SELECT subquery.name,min(subquery.startdate),max(subquery.enddate1)
FROM (SELECT NAME,startdate,
CASE WHEN EXISTS(SELECT yt1.startdate
FROM t yt1
WHERE yt1.startdate = DATEADD(day, 1, yt2.enddate)
) THEN null else yt2.enddate END as enddate1
FROM t yt2) as subquery
GROUP by NAME, CAST(MONTH(subquery.startdate) AS VARCHAR(2)) + '-' + CAST(YEAR(subquery.startdate) AS VARCHAR(4))
对于CASE WHEN EXISTS
我参考了SQL CASE
按年月分组可以看到这个GROUP BY MONTH AND YEAR
这是一种间隙岛问题。无需按天扩展数据!这看起来效率很低。
相反,确定 "islands"。这是没有重叠的地方——在你的情况下 lag()
就足够了。然后是累加和聚合:
select name, min(startdate), max(enddate)
from (select t.*,
sum(case when prev_enddate >= dateadd(day, -1, startdate) then 0 else 1 end) over
(partition by name order by startdate) as grp
from (select t.*,
lag(enddate) over (partition by name order by startdate) as prev_enddate
from t
) t
) t
group by name, grp;
Here 是一个 db<>fiddle.
我有一个包含一个人的开始日期和结束日期的数据范围,我只想获得每个人的连续日期范围:
输入:
NAME | STARTDATE | END DATE
--------------------------------------
MIKE | **2019-05-15** | 2019-05-16
MIKE | 2019-05-17 | **2019-05-18**
MIKE | 2020-05-18 | 2020-05-19
预期输出如下:
MIKE | **2019-05-15** | **2019-05-18**
MIKE | 2020-05-18 | 2020-05-19
所以对于这个人的每个连续周期,基本上输出是 MIN 和 MAX。
感谢任何帮助。
我试过以下查询:
With N AS ( SELECT Name, StartDate, EndDate
, LastStop = MAX(EndDate)
OVER (PARTITION BY Name ORDER BY StartDate, EndDate
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) FROM Table ), B AS ( SELECT Name, StartDate, EndDate
, Block = SUM(CASE WHEN LastStop Is Null Then 1
WHEN LastStop < StartDate Then 1
ELSE 0
END)
OVER (PARTITION BY Name ORDER BY StartDate, LastStop) FROM N ) SELECT Name
, MIN(StartDate) DateFrom
, MAX(EndDate) DateTo FROM B GROUP BY Name, Block ORDER BY Name, Block
但是没有考虑连续周期。它显示相同的输入。
这是一个使用临时计数的示例 table
示例或dbFiddle
;with cte as (
Select A.[Name]
,B.D
,Grp = datediff(day,'1900-01-01',D) - dense_rank() over (partition by [Name] Order by D)
From YourTable A
Cross Apply (
Select Top (DateDiff(DAY,StartDate,EndDate)+1) D=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),StartDate)
From master..spt_values n1,master..spt_values n2
) B
)
Select [Name]
,StartDate= min(D)
,EndDate = max(D)
From cte
Group By [Name],Grp
Returns
Name StartDate EndDate
MIKE 2019-05-15 2019-05-18
MIKE 2020-05-18 2020-05-19
只是为了帮助可视化,CTE 生成以下内容
这会给你相同的结果
SELECT subquery.name,min(subquery.startdate),max(subquery.enddate1)
FROM (SELECT NAME,startdate,
CASE WHEN EXISTS(SELECT yt1.startdate
FROM t yt1
WHERE yt1.startdate = DATEADD(day, 1, yt2.enddate)
) THEN null else yt2.enddate END as enddate1
FROM t yt2) as subquery
GROUP by NAME, CAST(MONTH(subquery.startdate) AS VARCHAR(2)) + '-' + CAST(YEAR(subquery.startdate) AS VARCHAR(4))
对于CASE WHEN EXISTS
我参考了SQL CASE
按年月分组可以看到这个GROUP BY MONTH AND YEAR
这是一种间隙岛问题。无需按天扩展数据!这看起来效率很低。
相反,确定 "islands"。这是没有重叠的地方——在你的情况下 lag()
就足够了。然后是累加和聚合:
select name, min(startdate), max(enddate)
from (select t.*,
sum(case when prev_enddate >= dateadd(day, -1, startdate) then 0 else 1 end) over
(partition by name order by startdate) as grp
from (select t.*,
lag(enddate) over (partition by name order by startdate) as prev_enddate
from t
) t
) t
group by name, grp;
Here 是一个 db<>fiddle.