How to differentiate continuous and non-continuous date ranges based on an ID column

ID  STRT_DT     ENT_DT
1   9/14/2020   10/5/2020
1   10/6/2020   10/8/2020
1   10/9/2020   12/31/2199
2   7/14/2020   11/5/2020
2   11/21/2020  11/22/2020
2   11/23/2020  12/31/2199

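Looking at the data above for IDs 1 and 2, the date ranges for ID 1 are continuous, while those for ID 2 are not (there is a gap between 11/5/2020 and 11/21/2020). I need to extract the continuous IDs in SQL. Expected output: if any date range within an ID is not continuous, that ID should not be returned by the query. So the expected output of the SQL is ID = 1.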

Query used:

SELECT tab.ID, tab.STRT_DT, tab.ENT_DT,
       STRT_DT - MIN(ENT_DT) OVER (PARTITION BY ID ORDER BY ENT_DT ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS diff,
       ENT_DT - MAX(STRT_DT) OVER (PARTITION BY ID ORDER BY ENT_DT ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS diff2
FROM tabLE
QUALIFY diff <> 1 OR diff2 <> -1

You can use a self-join instead of window functions.

select t1.id,
       t2.start_dt prev_start_dt, t2.end_dt prev_end_dt,
       t1.start_dt, t1.end_dt,
       to_date(t1.start_dt, 'MM/DD/YYYY') - to_date(t2.end_dt, 'MM/DD/YYYY') diff
from   t t1 inner join t t2 on t1.id = t2.id
where  to_date(t1.start_dt, 'MM/DD/YYYY') - to_date(t2.end_dt, 'MM/DD/YYYY') = 1
order by t1.id, t1.start_dt

Result:

ID  PREV_START_DT   PREV_END_DT START_DT    END_DT  DIFF
1   9/14/2020   10/5/2020   10/6/2020   10/8/2020   1
1   10/6/2020   10/8/2020   10/9/2020   12/31/2199  1
2   11/21/2020  11/22/2020  11/23/2020  12/31/2199  1

Demo: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=e20d48300f81e746826e44d8ee6982be


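If you only want the IDs whose rows are all continuous, you can use a left join and compare the number of joined rows with the total number of rows. In a fully continuous ID, every row except the first has a predecessor ending the day before it starts, so the joined count is exactly one less than the row count.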

select t1.id
from   t t1 left join t t2 
            on t1.id = t2.id
            and to_date(t1.start_dt, 'MM/DD/YYYY') - to_date(t2.end_dt, 'MM/DD/YYYY') = 1
group by t1.id
having count(t1.id) - 1 = count(t2.id)

Result:

ID
1

Demo: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=85679be30f170e3728d7e8b0b0da533e
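
For anyone who wants to try the left-join approach without an Oracle instance, here is a minimal sketch that runs an adapted version of the query in SQLite through Python's sqlite3 module. It is an adaptation under stated assumptions, not the query from the fiddle: SQLite has no to_date, so the dates are stored as ISO text and the "one day apart" condition is written with date(end_dt, '+1 day'). The function name continuous_ids_sql is made up for this illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (assumption: ISO text dates stand in for the MM/DD/YYYY strings).
ROWS = [
    (1, "2020-09-14", "2020-10-05"),
    (1, "2020-10-06", "2020-10-08"),
    (1, "2020-10-09", "2199-12-31"),
    (2, "2020-07-14", "2020-11-05"),
    (2, "2020-11-21", "2020-11-22"),
    (2, "2020-11-23", "2199-12-31"),
]

# Same shape as the left-join query above; only the date arithmetic
# is adapted to SQLite's date() function.
QUERY = """
select t1.id
from   t t1 left join t t2
            on t1.id = t2.id
            and t1.start_dt = date(t2.end_dt, '+1 day')
group by t1.id
having count(t1.id) - 1 = count(t2.id)
order by t1.id
"""


def continuous_ids_sql(rows):
    """Load rows into an in-memory table and return the continuous IDs."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (id integer, start_dt text, end_dt text)")
        conn.executemany("insert into t values (?, ?, ?)", rows)
        return [r[0] for r in conn.execute(QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(continuous_ids_sql(ROWS))
```

Note that an ID with a single row also passes the having clause (1 - 1 = 0 joined rows), which seems consistent with treating a lone range as continuous.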

select ID
from
 (
   select 
      ID, 
      -- flag non-continuous ranges, i.e. the previous end is not the day before the current start
      case when STRT_DT - 1
            <> LAG(ENT_DT) OVER (PARTITION BY ID ORDER BY STRT_DT)
           then 1
           else 0
      end as flag
   from table
 ) as dt
group by ID
having sum(flag) = 0 -- only continuous ranges exist
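
To sanity-check the flag-and-sum logic of the LAG query outside the database, the same rule can be sketched in plain Python. This is only an illustration of the idea, not a replacement for the SQL: the function name continuous_ids_lag and the tuple layout (id, start, end) are assumptions made for the example.

```python
from datetime import date, timedelta
from itertools import groupby

# Sample rows from the question as (id, start, end) tuples.
ROWS = [
    (1, date(2020, 9, 14), date(2020, 10, 5)),
    (1, date(2020, 10, 6), date(2020, 10, 8)),
    (1, date(2020, 10, 9), date(2199, 12, 31)),
    (2, date(2020, 7, 14), date(2020, 11, 5)),
    (2, date(2020, 11, 21), date(2020, 11, 22)),
    (2, date(2020, 11, 23), date(2199, 12, 31)),
]


def continuous_ids_lag(rows):
    """Return the IDs whose ranges, ordered by start date, have no flagged row."""
    result = []
    # Sort by id, then start date (mirrors PARTITION BY ID ORDER BY STRT_DT).
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for key, group in groupby(ordered, key=lambda r: r[0]):
        prev_end = None  # plays the role of LAG(ENT_DT); None on the first row
        flags = 0
        for _, start, end in group:
            # Flag the row when the previous end is not the day before this start.
            if prev_end is not None and start - timedelta(days=1) != prev_end:
                flags += 1
            prev_end = end
        if flags == 0:  # same test as: having sum(flag) = 0
            result.append(key)
    return result


if __name__ == "__main__":
    print(continuous_ids_lag(ROWS))
```

As in the SQL, the first row of each ID is never flagged (its LAG value is NULL, so the comparison falls through to the else branch), and overlapping ranges are flagged as well as gaps.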