Merging date ranges for recurring groups

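I have a set of non-overlapping date ranges for a set of characteristics. What I want to do is produce a single merged date range for each consecutive run of the same score, including scores that recur later.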

Going from this:

ID      START_DATE  END_DATE    SCORE
-------------------------------------
ABCDE   01/04/2014  30/04/2014  OK
ABCDE   01/05/2014  31/05/2014  OK
ABCDE   01/06/2014  30/06/2014  OK
ABCDE   01/07/2014  31/07/2014  OK
ABCDE   01/08/2014  31/08/2014  OK
ABCDE   01/09/2014  30/09/2014  GOOD
ABCDE   01/10/2014  31/10/2014  GOOD
ABCDE   01/11/2014  30/11/2014  GOOD
ABCDE   01/12/2014  31/12/2014  GOOD
ABCDE   01/01/2015  31/01/2015  GOOD
ABCDE   01/02/2015  28/02/2015  GOOD
ABCDE   01/03/2015  31/03/2015  GOOD
ABCDE   01/04/2015  30/04/2015  GOOD
ABCDE   01/05/2015  31/05/2015  GOOD
ABCDE   01/06/2015  30/06/2015  GOOD
ABCDE   01/07/2015  31/07/2015  GOOD
ABCDE   01/08/2015  31/08/2015  GOOD
ABCDE   01/09/2015  30/09/2015  GOOD
ABCDE   01/10/2015  31/10/2015  GOOD
ABCDE   01/11/2015  30/11/2015  GOOD
ABCDE   01/12/2015  31/12/2015  GOOD
ABCDE   01/01/2016  31/01/2016  GOOD
ABCDE   01/02/2016  29/02/2016  GOOD
ABCDE   01/03/2016  31/03/2016  GOOD
ABCDE   01/04/2016  30/04/2016  GOOD
ABCDE   01/05/2016  31/05/2016  GOOD
ABCDE   01/06/2016  30/06/2016  GOOD
ABCDE   01/07/2016  31/07/2016  GOOD
ABCDE   01/08/2016  31/08/2016  GOOD
ABCDE   01/09/2016  30/09/2016  GOOD
ABCDE   01/10/2016  31/10/2016  GOOD
ABCDE   01/11/2016  30/11/2016  GOOD
ABCDE   01/12/2016  31/12/2016  GOOD
ABCDE   01/01/2017  31/01/2017  GOOD
ABCDE   01/02/2017  28/02/2017  GOOD
ABCDE   01/03/2017  31/03/2017  GOOD
ABCDE   01/04/2017  30/04/2017  GOOD
ABCDE   01/05/2017  31/05/2017  GOOD
ABCDE   01/06/2017  30/06/2017  GOOD
ABCDE   01/07/2017  31/07/2017  GOOD
ABCDE   01/08/2017  31/08/2017  GOOD
ABCDE   01/09/2017  30/09/2017  GOOD
ABCDE   01/10/2017  31/10/2017  GOOD
ABCDE   01/11/2017  30/11/2017  GOOD
ABCDE   01/12/2017  31/12/2017  GOOD
ABCDE   01/01/2018  31/01/2018  GOOD
ABCDE   01/02/2018  28/02/2018  GOOD
ABCDE   01/03/2018  31/03/2018  GOOD
ABCDE   01/04/2018  30/04/2018  GOOD
ABCDE   01/05/2018  31/05/2018  GOOD
ABCDE   01/06/2018  30/06/2018  GOOD
ABCDE   01/07/2018  31/07/2018  GOOD
ABCDE   01/08/2018  31/08/2018  BAD
ABCDE   01/09/2018  30/09/2018  BAD
ABCDE   01/10/2018  31/10/2018  GOOD
ABCDE   01/11/2018  30/11/2018  GOOD
ABCDE   01/12/2018  31/12/2018  GOOD
ABCDE   01/01/2019  31/01/2019  GOOD
ABCDE   01/02/2019  28/02/2019  GOOD
ABCDE   01/03/2019  31/03/2019  GOOD

To this:

ID      START_DATE  END_DATE    SCORE
-------------------------------------
ABCDE   01/04/2014  31/08/2014  OK
ABCDE   01/09/2014  31/07/2018  GOOD
ABCDE   01/08/2018  30/09/2018  BAD
ABCDE   01/10/2018  31/03/2019  GOOD

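This is what I have tried so far, but I cannot get the final 'GOOD' score as a separate range, because I am using ROW_NUMBER() over a partition, which only gives the minimum and maximum for each ID and SCORE grouping. Any help would be appreciated.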

IF OBJECT_ID('tempdb..#TEST') IS NOT NULL DROP TABLE #TEST


CREATE TABLE #TEST (
    [ID] [varchar](10) NULL,
    [START_DATE] [DATE] NULL,
    [END_DATE] [date] NULL,
    [SCORE] [varchar](10) NOT NULL
) ON [PRIMARY]
GO


INSERT INTO #TEST

VALUES

('ABCDE','2014-04-01','2014-04-30','OK'),
('ABCDE','2014-05-01','2014-05-31','OK'),
('ABCDE','2014-06-01','2014-06-30','OK'),
('ABCDE','2014-07-01','2014-07-31','OK'),
('ABCDE','2014-08-01','2014-08-31','OK'),
('ABCDE','2014-09-01','2014-09-30','GOOD'),
('ABCDE','2014-10-01','2014-10-31','GOOD'),
('ABCDE','2014-11-01','2014-11-30','GOOD'),
('ABCDE','2014-12-01','2014-12-31','GOOD'),
('ABCDE','2015-01-01','2015-01-31','GOOD'),
('ABCDE','2015-02-01','2015-02-28','GOOD'),
('ABCDE','2015-03-01','2015-03-31','GOOD'),
('ABCDE','2015-04-01','2015-04-30','GOOD'),
('ABCDE','2015-05-01','2015-05-31','GOOD'),
('ABCDE','2015-06-01','2015-06-30','GOOD'),
('ABCDE','2015-07-01','2015-07-31','GOOD'),
('ABCDE','2015-08-01','2015-08-31','GOOD'),
('ABCDE','2015-09-01','2015-09-30','GOOD'),
('ABCDE','2015-10-01','2015-10-31','GOOD'),
('ABCDE','2015-11-01','2015-11-30','GOOD'),
('ABCDE','2015-12-01','2015-12-31','GOOD'),
('ABCDE','2016-01-01','2016-01-31','GOOD'),
('ABCDE','2016-02-01','2016-02-29','GOOD'),
('ABCDE','2016-03-01','2016-03-31','GOOD'),
('ABCDE','2016-04-01','2016-04-30','GOOD'),
('ABCDE','2016-05-01','2016-05-31','GOOD'),
('ABCDE','2016-06-01','2016-06-30','GOOD'),
('ABCDE','2016-07-01','2016-07-31','GOOD'),
('ABCDE','2016-08-01','2016-08-31','GOOD'),
('ABCDE','2016-09-01','2016-09-30','GOOD'),
('ABCDE','2016-10-01','2016-10-31','GOOD'),
('ABCDE','2016-11-01','2016-11-30','GOOD'),
('ABCDE','2016-12-01','2016-12-31','GOOD'),
('ABCDE','2017-01-01','2017-01-31','GOOD'),
('ABCDE','2017-02-01','2017-02-28','GOOD'),
('ABCDE','2017-03-01','2017-03-31','GOOD'),
('ABCDE','2017-04-01','2017-04-30','GOOD'),
('ABCDE','2017-05-01','2017-05-31','GOOD'),
('ABCDE','2017-06-01','2017-06-30','GOOD'),
('ABCDE','2017-07-01','2017-07-31','GOOD'),
('ABCDE','2017-08-01','2017-08-31','GOOD'),
('ABCDE','2017-09-01','2017-09-30','GOOD'),
('ABCDE','2017-10-01','2017-10-31','GOOD'),
('ABCDE','2017-11-01','2017-11-30','GOOD'),
('ABCDE','2017-12-01','2017-12-31','GOOD'),
('ABCDE','2018-01-01','2018-01-31','GOOD'),
('ABCDE','2018-02-01','2018-02-28','GOOD'),
('ABCDE','2018-03-01','2018-03-31','GOOD'),
('ABCDE','2018-04-01','2018-04-30','GOOD'),
('ABCDE','2018-05-01','2018-05-31','GOOD'),
('ABCDE','2018-06-01','2018-06-30','GOOD'),
('ABCDE','2018-07-01','2018-07-31','GOOD'),
('ABCDE','2018-08-01','2018-08-31','BAD'),
('ABCDE','2018-09-01','2018-09-30','BAD'),
('ABCDE','2018-10-01','2018-10-31','GOOD'),
('ABCDE','2018-11-01','2018-11-30','GOOD'),
('ABCDE','2018-12-01','2018-12-31','GOOD'),
('ABCDE','2019-01-01','2019-01-31','GOOD'),
('ABCDE','2019-02-01','2019-02-28','GOOD'),
('ABCDE','2019-03-01','2019-03-31','GOOD')




IF OBJECT_ID('tempdb..#START') IS NOT NULL DROP TABLE #START
SELECT * INTO #START FROM (
SELECT ID
      ,[START_DATE]
      ,[END_DATE]
      ,SCORE
      ,ROW_NUMBER() OVER (PARTITION BY ID,SCORE ORDER BY START_DATE ASC) AS R
  FROM #TEST
  )X

  IF OBJECT_ID('tempdb..#END') IS NOT NULL DROP TABLE #END
  SELECT * INTO #END FROM (
SELECT ID
      ,[START_DATE]
      ,[END_DATE]
      ,SCORE
      ,ROW_NUMBER() OVER (PARTITION BY ID,SCORE ORDER BY START_DATE DESC) AS R
  FROM #TEST
  )X



  SELECT  
  S.ID,
  S.START_DATE,
  E.END_DATE,
  S.SCORE
  FROM #START S
  LEFT JOIN #END E ON E.ID = S.ID AND S.SCORE = E.SCORE

  WHERE S.R=1 AND E.R=1

  ORDER BY 1,2

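This is a gaps-and-islands problem. In this case, you are looking for overlaps (or ranges that touch) between a row and the earlier rows with the same score. When there is no overlap, you have the start of a group. A cumulative sum of those group starts defines the groups: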

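select id, score, min(start_date) as start_date, max(end_date) as end_date
from (select t.*,
             -- running count of group starts: adds 1 whenever a row neither touches nor overlaps the earlier ranges
             sum(case when prev_end_date >= dateadd(day, -1, start_date) then 0 else 1 end)
                 over (partition by id, score order by start_date) as grp
      from (select t.*,
                   -- latest end date among all earlier rows with the same id and score
                   max(end_date) over (partition by id, score
                                       order by start_date
                                       rows between unbounded preceding and 1 preceding
                                      ) as prev_end_date
            from #test t
           ) t
     ) t
group by id, score, grp
order by id, min(start_date);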
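
To see where the groups start before anything is aggregated, it can help to look at the flag that feeds the cumulative sum. The following is only a sketch, assuming the #test table from the question is populated and SQL Server 2012 or later. It uses lag() rather than the running max(), which is equivalent only because the ranges are stated to be non-overlapping. The column name is_new_group is made up for illustration.

```sql
-- Sketch: show each row with the previous end date and a "new group" flag.
-- Assumes #test exists as created in the question and that ranges for one
-- id/score never overlap, so the previous row's end_date is also the latest one.
select id, score, start_date, end_date,
       prev_end_date,
       -- 1 = this row starts a new group, 0 = it continues the previous range
       case when prev_end_date >= dateadd(day, -1, start_date) then 0 else 1 end as is_new_group
from (select t.*,
             lag(end_date) over (partition by id, score order by start_date) as prev_end_date
      from #test t
     ) t
order by id, start_date;
```

With the sample data, four rows should come back flagged with 1: 2014-04-01 (OK), 2014-09-01 (GOOD), 2018-08-01 (BAD) and 2018-10-01 (GOOD). Those are exactly the start dates in the desired output, and the second GOOD row is flagged because its previous GOOD range ended on 2018-07-31, two months earlier.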

Here is a db<>fiddle.
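
If consecutive rows for an ID never have a date gap between them, as in the sample data, a different and commonly used gaps-and-islands technique is the difference of two row numbers. This is an alternative sketch, not the query above: it assumes the same #test table and groups purely by row order, so it would merge two same-score rows even if there were a gap between their dates. The alias grp is illustrative.

```sql
-- Sketch: difference-of-row-numbers variant.
-- Assumes #test from the question and no date gaps between consecutive rows of an id.
select id, score, min(start_date) as start_date, max(end_date) as end_date
from (select t.*,
             -- the difference stays constant while the score does not change
             row_number() over (partition by id order by start_date)
           - row_number() over (partition by id, score order by start_date) as grp
      from #test t
     ) t
group by id, score, grp
order by id, min(start_date);
```

For the sample data the difference is 0 for the OK rows, 5 for the first GOOD run, 52 for the BAD rows and 7 for the final GOOD run, so the two GOOD runs end up in separate groups.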