Create new date ranges from overlapping date ranges and assign an ID

I have the following table:

ID  | START_DATE | END_DATE   | FEATURE
---------------------------------------
001 | 1995-08-01 | 1997-12-31 | 1
001 | 1998-01-01 | 2017-03-31 | 4
001 | 2000-06-14 | 2017-03-31 | 5
001 | 2013-04-01 | 2017-03-31 | 8
002 | 1929-10-01 | 2006-05-25 | 1
002 | 2006-05-26 | 2016-11-10 | 4
002 | 2006-05-26 | 2016-11-10 | 7
002 | 2013-04-01 | 2016-11-10 | 8

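I want to transform this table into a consolidated one: find the overlapping date ranges and merge them into new rows, so that the result is a set of non-overlapping date ranges.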

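The part I need the most help with is merging the FEATURE column, which should concatenate every feature that applies to a range, in the format below.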

ID  | START_DATE | END_DATE   | FEATURE
---------------------------------------
001 | 1995-08-01 | 1997-12-31 | 1
001 | 1998-01-01 | 2000-06-13 | 4
001 | 2000-06-14 | 2013-03-31 | 45
001 | 2013-04-01 | 2017-03-31 | 458
002 | 1929-10-01 | 2006-05-25 | 1
002 | 2006-05-26 | 2013-03-31 | 47
002 | 2013-04-01 | 2016-11-10 | 478

I created the test data as follows.

CREATE TABLE #TEST (
    [ID] [varchar](10) NULL,
    [START_DATE] [date] NULL,
    [END_DATE] [date] NULL,
    [FEATURE] [int] NOT NULL
) ON [PRIMARY]
GO


INSERT INTO #TEST

VALUES

('001','1998-01-01','2017-03-31',4),
('001','2000-06-14','2017-03-31',5),
('001','2013-04-01','2017-03-31',8),
('001','1995-08-01','1997-12-31',1),
('002','2006-05-26','2016-11-10',4),
('002','2006-05-26','2016-11-10',7),
('002','2013-04-01','2016-11-10',8),
('002','1929-10-01','2006-05-25',1)

Here is a query that sets END_DATE. It looks like you are using SQL Server, but it should run on almost any database with little or no modification.

with grouped_data as
(
    select ID, START_DATE, END_DATE from #TEST group by ID, START_DATE, END_DATE
) 
,cte as
(
    select 
        *,
        ROW_NUMBER() over (partition by ID order by start_date) as nr
    from grouped_data
)
select 
     c1.ID
    ,c1.START_DATE
    -- cut the range short only when the next range for the same ID starts before this one ends
    ,case when c2.START_DATE <= c1.END_DATE then DATEADD(DAY, -1, c2.START_DATE) else c1.END_DATE end as END_DATE
from cte as c1
left join cte as c2
    on c1.ID = c2.ID
    and c1.nr = c2.nr -1
order by c1.ID, c1.START_DATE

If you have SQL Server 2017 or later, you can use STRING_AGG to concatenate the FEATURE values easily.
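As a rough sketch of how the two steps could be combined (not necessarily what the answer above had in mind), the query below first collects every date at which a new sub-range can begin, pairs each such date with the next one using LEAD, and then joins back to #TEST to aggregate the features covering each sub-range. It assumes the #TEST table from the question and SQL Server 2017 or later; the CTE names `boundaries` and `segments` and the column `BOUNDARY_DATE` are made up for illustration.

```sql
-- Assumes the #TEST table from the question and SQL Server 2017+ (for STRING_AGG).
WITH boundaries AS (
    -- Every date on which a new sub-range can begin:
    -- each START_DATE and the day after each END_DATE (UNION removes duplicates).
    SELECT ID, START_DATE AS BOUNDARY_DATE FROM #TEST
    UNION
    SELECT ID, DATEADD(DAY, 1, END_DATE) FROM #TEST
),
segments AS (
    -- Pair each boundary with the next one for the same ID.
    -- The last boundary per ID has no successor, so its END_DATE is NULL.
    SELECT
        ID,
        BOUNDARY_DATE AS START_DATE,
        DATEADD(DAY, -1, LEAD(BOUNDARY_DATE) OVER (PARTITION BY ID ORDER BY BOUNDARY_DATE)) AS END_DATE
    FROM boundaries
)
SELECT
    s.ID,
    s.START_DATE,
    s.END_DATE,
    -- Concatenate the features of all original rows that fully cover the sub-range.
    STRING_AGG(CAST(t.FEATURE AS varchar(10)), '') WITHIN GROUP (ORDER BY t.FEATURE) AS FEATURE
FROM segments AS s
JOIN #TEST AS t
    ON  t.ID = s.ID
    AND t.START_DATE <= s.START_DATE
    AND t.END_DATE   >= s.END_DATE
WHERE s.END_DATE IS NOT NULL
GROUP BY s.ID, s.START_DATE, s.END_DATE
ORDER BY s.ID, s.START_DATE;
```

On the sample data this should return the seven rows of the desired output, with FEATURE as a string such as 45 or 458. Because of the inner join, any gap between ranges that no original row covers is dropped rather than returned with an empty FEATURE.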

You can use apply:

select distinct t.id, t.START_DATE, t.END_DATE, coalesce(tt.feature, t.feature) as feature
from #test t outer apply
     ( -- concatenate the features of all rows with the same END_DATE that start no later than this row
       select '' + cast(t1.feature as varchar(10))
       from #test t1 
       where t1.id = t.id and t1.end_date = t.end_date and t1.start_date <= t.start_date
       order by t1.start_date
       for xml path('')
     ) tt(feature)
order by t.id, t.START_DATE;

Here is a db<>fiddle.