Merging date ranges for recurring groups
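I have a set of non-overlapping date ranges for a set of characteristics. What I want to do is produce a unified date range for each recurring score.
Take this as the starting point: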
ID START_DATE END_DATE SCORE
-------------------------------------
ABCDE 01/04/2014 30/04/2014 OK
ABCDE 01/05/2014 31/05/2014 OK
ABCDE 01/06/2014 30/06/2014 OK
ABCDE 01/07/2014 31/07/2014 OK
ABCDE 01/08/2014 31/08/2014 OK
ABCDE 01/09/2014 30/09/2014 GOOD
ABCDE 01/10/2014 31/10/2014 GOOD
ABCDE 01/11/2014 30/11/2014 GOOD
ABCDE 01/12/2014 31/12/2014 GOOD
ABCDE 01/01/2015 31/01/2015 GOOD
ABCDE 01/02/2015 28/02/2015 GOOD
ABCDE 01/03/2015 31/03/2015 GOOD
ABCDE 01/04/2015 30/04/2015 GOOD
ABCDE 01/05/2015 31/05/2015 GOOD
ABCDE 01/06/2015 30/06/2015 GOOD
ABCDE 01/07/2015 31/07/2015 GOOD
ABCDE 01/08/2015 31/08/2015 GOOD
ABCDE 01/09/2015 30/09/2015 GOOD
ABCDE 01/10/2015 31/10/2015 GOOD
ABCDE 01/11/2015 30/11/2015 GOOD
ABCDE 01/12/2015 31/12/2015 GOOD
ABCDE 01/01/2016 31/01/2016 GOOD
ABCDE 01/02/2016 29/02/2016 GOOD
ABCDE 01/03/2016 31/03/2016 GOOD
ABCDE 01/04/2016 30/04/2016 GOOD
ABCDE 01/05/2016 31/05/2016 GOOD
ABCDE 01/06/2016 30/06/2016 GOOD
ABCDE 01/07/2016 31/07/2016 GOOD
ABCDE 01/08/2016 31/08/2016 GOOD
ABCDE 01/09/2016 30/09/2016 GOOD
ABCDE 01/10/2016 31/10/2016 GOOD
ABCDE 01/11/2016 30/11/2016 GOOD
ABCDE 01/12/2016 31/12/2016 GOOD
ABCDE 01/01/2017 31/01/2017 GOOD
ABCDE 01/02/2017 28/02/2017 GOOD
ABCDE 01/03/2017 31/03/2017 GOOD
ABCDE 01/04/2017 30/04/2017 GOOD
ABCDE 01/05/2017 31/05/2017 GOOD
ABCDE 01/06/2017 30/06/2017 GOOD
ABCDE 01/07/2017 31/07/2017 GOOD
ABCDE 01/08/2017 31/08/2017 GOOD
ABCDE 01/09/2017 30/09/2017 GOOD
ABCDE 01/10/2017 31/10/2017 GOOD
ABCDE 01/11/2017 30/11/2017 GOOD
ABCDE 01/12/2017 31/12/2017 GOOD
ABCDE 01/01/2018 31/01/2018 GOOD
ABCDE 01/02/2018 28/02/2018 GOOD
ABCDE 01/03/2018 31/03/2018 GOOD
ABCDE 01/04/2018 30/04/2018 GOOD
ABCDE 01/05/2018 31/05/2018 GOOD
ABCDE 01/06/2018 30/06/2018 GOOD
ABCDE 01/07/2018 31/07/2018 GOOD
ABCDE 01/08/2018 31/08/2018 BAD
ABCDE 01/09/2018 30/09/2018 BAD
ABCDE 01/10/2018 31/10/2018 GOOD
ABCDE 01/11/2018 30/11/2018 GOOD
ABCDE 01/12/2018 31/12/2018 GOOD
ABCDE 01/01/2019 31/01/2019 GOOD
ABCDE 01/02/2019 28/02/2019 GOOD
ABCDE 01/03/2019 31/03/2019 GOOD
To this:
ID START_DATE END_DATE SCORE
-------------------------------------
ABCDE 01/04/2014 31/08/2014 OK
ABCDE 01/09/2014 31/07/2018 GOOD
ABCDE 01/08/2018 30/09/2018 BAD
ABCDE 01/10/2018 31/03/2019 GOOD
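This is what I have tried so far, but I cannot get the final 'GOOD' score, because I am using a row-over-partition function (ROW_NUMBER), which only gives the minimum and maximum for each grouping. Any help would be appreciated.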
DROP TABLE #TEST
CREATE TABLE #TEST (
[ID] [varchar](10) NULL,
[START_DATE] [DATE] NULL,
[END_DATE] [date] NULL,
[SCORE] [varchar](10) NOT NULL
) ON [PRIMARY]
GO
INSERT INTO #TEST
VALUES
('ABCDE','2014-04-01','2014-04-30','OK'),
('ABCDE','2014-05-01','2014-05-31','OK'),
('ABCDE','2014-06-01','2014-06-30','OK'),
('ABCDE','2014-07-01','2014-07-31','OK'),
('ABCDE','2014-08-01','2014-08-31','OK'),
('ABCDE','2014-09-01','2014-09-30','GOOD'),
('ABCDE','2014-10-01','2014-10-31','GOOD'),
('ABCDE','2014-11-01','2014-11-30','GOOD'),
('ABCDE','2014-12-01','2014-12-31','GOOD'),
('ABCDE','2015-01-01','2015-01-31','GOOD'),
('ABCDE','2015-02-01','2015-02-28','GOOD'),
('ABCDE','2015-03-01','2015-03-31','GOOD'),
('ABCDE','2015-04-01','2015-04-30','GOOD'),
('ABCDE','2015-05-01','2015-05-31','GOOD'),
('ABCDE','2015-06-01','2015-06-30','GOOD'),
('ABCDE','2015-07-01','2015-07-31','GOOD'),
('ABCDE','2015-08-01','2015-08-31','GOOD'),
('ABCDE','2015-09-01','2015-09-30','GOOD'),
('ABCDE','2015-10-01','2015-10-31','GOOD'),
('ABCDE','2015-11-01','2015-11-30','GOOD'),
('ABCDE','2015-12-01','2015-12-31','GOOD'),
('ABCDE','2016-01-01','2016-01-31','GOOD'),
('ABCDE','2016-02-01','2016-02-29','GOOD'),
('ABCDE','2016-03-01','2016-03-31','GOOD'),
('ABCDE','2016-04-01','2016-04-30','GOOD'),
('ABCDE','2016-05-01','2016-05-31','GOOD'),
('ABCDE','2016-06-01','2016-06-30','GOOD'),
('ABCDE','2016-07-01','2016-07-31','GOOD'),
('ABCDE','2016-08-01','2016-08-31','GOOD'),
('ABCDE','2016-09-01','2016-09-30','GOOD'),
('ABCDE','2016-10-01','2016-10-31','GOOD'),
('ABCDE','2016-11-01','2016-11-30','GOOD'),
('ABCDE','2016-12-01','2016-12-31','GOOD'),
('ABCDE','2017-01-01','2017-01-31','GOOD'),
('ABCDE','2017-02-01','2017-02-28','GOOD'),
('ABCDE','2017-03-01','2017-03-31','GOOD'),
('ABCDE','2017-04-01','2017-04-30','GOOD'),
('ABCDE','2017-05-01','2017-05-31','GOOD'),
('ABCDE','2017-06-01','2017-06-30','GOOD'),
('ABCDE','2017-07-01','2017-07-31','GOOD'),
('ABCDE','2017-08-01','2017-08-31','GOOD'),
('ABCDE','2017-09-01','2017-09-30','GOOD'),
('ABCDE','2017-10-01','2017-10-31','GOOD'),
('ABCDE','2017-11-01','2017-11-30','GOOD'),
('ABCDE','2017-12-01','2017-12-31','GOOD'),
('ABCDE','2018-01-01','2018-01-31','GOOD'),
('ABCDE','2018-02-01','2018-02-28','GOOD'),
('ABCDE','2018-03-01','2018-03-31','GOOD'),
('ABCDE','2018-04-01','2018-04-30','GOOD'),
('ABCDE','2018-05-01','2018-05-31','GOOD'),
('ABCDE','2018-06-01','2018-06-30','GOOD'),
('ABCDE','2018-07-01','2018-07-31','GOOD'),
('ABCDE','2018-08-01','2018-08-31','BAD'),
('ABCDE','2018-09-01','2018-09-30','BAD'),
('ABCDE','2018-10-01','2018-10-31','GOOD'),
('ABCDE','2018-11-01','2018-11-30','GOOD'),
('ABCDE','2018-12-01','2018-12-31','GOOD'),
('ABCDE','2019-01-01','2019-01-31','GOOD'),
('ABCDE','2019-02-01','2019-02-28','GOOD'),
('ABCDE','2019-03-01','2019-03-31','GOOD')
DROP TABLE #START
SELECT * INTO #START FROM (
SELECT ID
,[START_DATE]
,[END_DATE]
,SCORE
,ROW_NUMBER() OVER (PARTITION BY ID,SCORE ORDER BY START_DATE ASC) AS R
FROM #TEST
)X
DROP TABLE #END
SELECT * INTO #END FROM (
SELECT ID
,[START_DATE]
,[END_DATE]
,SCORE
,ROW_NUMBER() OVER (PARTITION BY ID,SCORE ORDER BY START_DATE DESC) AS R
FROM #TEST
)X
SELECT
S.ID,
S.START_DATE,
E.END_DATE,
S.SCORE
FROM #START S
LEFT JOIN #END E ON E.ID = S.ID AND S.SCORE = E.SCORE
WHERE S.R=1 AND E.R=1
ORDER BY 1,2
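This is a gaps-and-islands problem. In this case, you are looking for overlaps, which here means ranges with the same id and score that overlap or touch. When there is no overlap, you have the start of a new group. A cumulative sum of those starts then defines the groups: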
select id, score, min(start_date) as start_date, max(end_date) as end_date
from (select t.*,
             -- running count of group starts = group number within (id, score)
             sum(case when prev_end_date >= dateadd(day, -1, start_date) then 0 else 1 end)
                 over (partition by id, score order by start_date) as grouping
      from (select t.*,
                   -- latest end date among the earlier rows of the same (id, score)
                   max(end_date) over (partition by id, score
                                       order by start_date
                                       rows between unbounded preceding and 1 preceding
                                      ) as prev_end_date
            from #test t
           ) t
     ) t
group by id, score, grouping
order by id, min(start_date);
Here is a db<>fiddle.
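If it helps to see why the cumulative sum works, the same logic can be traced outside SQL. The following is only a rough Python sketch of the idea, not the query itself: the function name `merge_ranges` and the shortened sample data are made up for illustration. It walks each (id, score) partition in start-date order, keeps the latest end date seen so far, and opens a new group whenever that date does not reach the day before the current start.

```python
from datetime import date, timedelta


def merge_ranges(rows):
    """Merge adjacent or overlapping ranges that share the same (id, score).

    rows: iterable of (id, start_date, end_date, score) tuples.
    Returns merged (id, start_date, end_date, score) tuples ordered by id, start.
    """
    islands = {}  # (id, score, group number) -> [min start, max end]
    state = {}    # (id, score) -> [latest end date so far, group number]

    # Mirrors "partition by id, score order by start_date" in the query.
    for id_, start, end, score in sorted(rows, key=lambda r: (r[0], r[3], r[1])):
        key = (id_, score)
        if key not in state:
            # First row of the partition always opens group 1.
            state[key] = [end, 1]
        else:
            prev_end, group = state[key]
            # Same test as the CASE expression: a new group starts unless the
            # latest earlier end date reaches the day before this start date.
            if prev_end < start - timedelta(days=1):
                group += 1
            state[key] = [max(prev_end, end), group]

        bounds = islands.setdefault((id_, score, state[key][1]), [start, end])
        bounds[0] = min(bounds[0], start)
        bounds[1] = max(bounds[1], end)

    merged = [(k[0], b[0], b[1], k[1]) for k, b in islands.items()]
    return sorted(merged, key=lambda r: (r[0], r[1]))


# Shortened, made-up sample with the same shape as the question's data:
# GOOD appears twice, separated by a BAD month.
sample = [
    ("ABCDE", date(2014, 4, 1), date(2014, 4, 30), "OK"),
    ("ABCDE", date(2014, 5, 1), date(2014, 5, 31), "OK"),
    ("ABCDE", date(2014, 6, 1), date(2014, 6, 30), "GOOD"),
    ("ABCDE", date(2014, 7, 1), date(2014, 7, 31), "GOOD"),
    ("ABCDE", date(2014, 8, 1), date(2014, 8, 31), "BAD"),
    ("ABCDE", date(2014, 9, 1), date(2014, 9, 30), "GOOD"),
]

merged = merge_ranges(sample)
for row in merged:
    print(row)
```

With this sample the second GOOD run comes out as its own row, because the BAD month leaves a gap inside the GOOD partition. That gap is what a plain minimum and maximum per score cannot see.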