Select and skip rows by date range, checking the difference between them

I'm new to Teradata and I have a small SQL problem, similar to the one below:

Source table:

a|b|c|  dt      |dt_f
-------------------------
1|1|5|30/01/2020|21/02/2020
1|1|2|28/02/2020|19/03/2020
1|1|2|20/03/2020|17/04/2020
1|1|2|19/04/2020|05/05/2020
1|1|2|30/06/2020|24/07/2020
1|1|2|27/07/2020|31/12/2999

Required output:

a|b|c|    dt    |dt_f
------------------------------
1|1|5|30/01/2020|**27/02/2020**
1|1|2|28/02/2020|**19/05/2020**
1|1|2|30/06/2020|**31/12/2999**

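Explanation:

1 --> If c differs between the current row and the next row, then dt_f of the current row = dt of the next row - 1 day, and both rows are selected.

2 --> If months_between(dt,dt) > 1 (in the example, between row 4 and row 5), then the dt_f of the first selected row with the same a, b, c will be dt (row 4) + 1 month, and row 5 will be selected with dt_f = 31/12/2999.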
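To make the two rules concrete, the date arithmetic for the sample rows can be checked with a short Python sketch. Python is used here only for illustration, and `add_months` is a hypothetical helper assumed to behave like Teradata's Add_Months (it clamps to the last day of the month). The gap test is one possible reading of rule 2: the next row starts more than one month after the previous row ends.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, n: int) -> date:
    """Hypothetical helper: shift d by n months, clamping the day to the month end."""
    month_index = d.year * 12 + (d.month - 1) + n
    year, month = divmod(month_index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


# Rule 1: c changes between row 1 (c=5) and row 2 (c=2),
# so row 1 ends one day before row 2 starts.
row2_dt = date(2020, 2, 28)
rule1_dt_f = row2_dt - timedelta(days=1)

# Rule 2: row 5 starts more than one month after row 4 ends,
# so the island that began on 28/02/2020 ends at row 4's dt + 1 month.
row4_dt, row4_dt_f = date(2020, 4, 19), date(2020, 5, 5)
row5_dt = date(2020, 6, 30)
gap_found = add_months(row4_dt_f, 1) < row5_dt
rule2_dt_f = add_months(row4_dt, 1)

print(rule1_dt_f.strftime("%d/%m/%Y"))  # 27/02/2020
print(gap_found)                        # True
print(rule2_dt_f.strftime("%d/%m/%Y"))  # 19/05/2020
```

Both computed dates match the highlighted dt_f values in the required output.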

I tried a lot with recursion but didn't get the right result, though I believe it can be solved that way.

Thanks for your reply :)

If this were only about returning overlapping rows, it would be really simple using Teradata's NORMALIZE extension:

CREATE VOLATILE TABLE vt 
(a INT, b INT, c INT, dt DATE, dt_f DATE)
ON COMMIT PRESERVE ROWS;

INSERT INTO vt(1, 1, 5, DATE '2020-01-30', DATE '2020-02-21');
INSERT INTO vt(1, 1, 2, DATE '2020-02-28', DATE '2020-03-19');
INSERT INTO vt(1, 1, 2, DATE '2020-03-20', DATE '2020-04-17');
INSERT INTO vt(1, 1, 2, DATE '2020-04-19', DATE '2020-05-05');
INSERT INTO vt(1, 1, 2, DATE '2020-06-30', DATE '2020-07-24');
INSERT INTO vt(1, 1, 2, DATE '2020-07-27', DATE '2999-12-31'); 

WITH cte AS 
 ( -- adjusting for gaps > 1 month
   SELECT NORMALIZE a,b,c
     ,PERIOD(dt, Add_Months(dt_f,1)) AS pd
   FROM vt
 )
SELECT a,b,c
  ,Begin(pd) AS dt
  ,Add_Months(End(pd),-1) AS dt_f
FROM cte
;
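For readers without a Teradata system at hand, the effect of that query can be approximated in Python. This is only a sketch under stated assumptions: `normalize` and `add_months` are hypothetical helpers, and the merge rule (combine periods of the same a, b, c that overlap or meet) is my reading of what NORMALIZE does with the extended periods.

```python
import calendar
from datetime import date


def add_months(d, n):
    """Hypothetical stand-in for Add_Months: clamps the day to the month end."""
    year, month = divmod(d.year * 12 + (d.month - 1) + n, 12)
    month += 1
    return date(year, month, min(d.day, calendar.monthrange(year, month)[1]))


rows = [  # (a, b, c, dt, dt_f) from the sample table
    (1, 1, 5, date(2020, 1, 30), date(2020, 2, 21)),
    (1, 1, 2, date(2020, 2, 28), date(2020, 3, 19)),
    (1, 1, 2, date(2020, 3, 20), date(2020, 4, 17)),
    (1, 1, 2, date(2020, 4, 19), date(2020, 5, 5)),
    (1, 1, 2, date(2020, 6, 30), date(2020, 7, 24)),
    (1, 1, 2, date(2020, 7, 27), date(2999, 12, 31)),
]


def normalize(rows):
    """Merge periods per (a, b, c) after extending each end by one month."""
    merged = []
    for a, b, c, dt, dt_f in sorted(rows, key=lambda r: (r[:3], r[3])):
        end = add_months(dt_f, 1)  # extended end, as in PERIOD(dt, Add_Months(dt_f,1))
        last = merged[-1] if merged else None
        if last and last[:3] == [a, b, c] and dt <= last[4]:
            last[4] = max(last[4], end)  # overlaps or meets: grow the island
        else:
            merged.append([a, b, c, dt, end])  # gap: start a new island
    # Undo the one-month extension, as the outer SELECT does.
    return [(a, b, c, start, add_months(end, -1)) for a, b, c, start, end in merged]


for a, b, c, dt, dt_f in sorted(normalize(rows), key=lambda r: r[3]):
    print(a, b, c, dt.strftime("%d/%m/%Y"), dt_f.strftime("%d/%m/%Y"))
```

In this sketch the sample rows collapse into three islands starting on 30/01/2020, 28/02/2020 and 30/06/2020, with the unadjusted end dates 21/02/2020, 05/05/2020 and 31/12/2999.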

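But your logic to adjust the end date requires analytic functions. This is probably the simplest query to get overlapping periods plus additional columns, modified to match your logic: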

WITH cte AS
 ( -- returns both start/end of an island, but in separate rows
   SELECT 
      a,b,c
     ,dt        -- start of current island
     ,Max(dt_f) -- end of previous island (used for finding gaps)
      Over (PARTITION BY a,b,c
            ORDER BY dt
            ROWS BETWEEN Unbounded Preceding
                     AND 1 Preceding) AS prev_max_end
     ,Lag(dt)   -- to adjust end date in case of gap > 1 month
      Over (PARTITION BY a,b,c
            ORDER BY dt) AS prev_dt
   FROM vt
   QUALIFY Add_Months(prev_max_end,1) < dt -- gap found
        OR prev_max_end IS NULL            -- first row
 )
SELECT
   a,b,c
  ,dt -- start of current island

   -- next row has end of current island
  ,CASE
     WHEN Lead(c ) -- change in c column?
          Over (PARTITION BY a,b
                ORDER BY dt) <> c 
     THEN Lead(dt) -- start of next island - 1
          Over (PARTITION BY a,b
                ORDER BY dt) -1
     ELSE -- same c: last start before the gap + 1 month, or the high date if there is no next island
          Lead(Add_Months(prev_dt,1),1,DATE '2999-12-31')
          Over (PARTITION BY a,b
                ORDER BY dt)
   END AS dt_f
FROM cte
;
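The two steps of this query can be traced outside the database with a small Python emulation. It is a sketch, not Teradata code: `island_starts`, `close_islands` and `add_months` are hypothetical names, and the window functions are replaced by plain loops over rows sorted the same way as the PARTITION BY / ORDER BY clauses.

```python
import calendar
from datetime import date, timedelta
from itertools import groupby

HIGH_DATE = date(2999, 12, 31)


def add_months(d, n):
    """Hypothetical stand-in for Add_Months: clamps the day to the month end."""
    year, month = divmod(d.year * 12 + (d.month - 1) + n, 12)
    month += 1
    return date(year, month, min(d.day, calendar.monthrange(year, month)[1]))


rows = [  # (a, b, c, dt, dt_f) from the sample table
    (1, 1, 5, date(2020, 1, 30), date(2020, 2, 21)),
    (1, 1, 2, date(2020, 2, 28), date(2020, 3, 19)),
    (1, 1, 2, date(2020, 3, 20), date(2020, 4, 17)),
    (1, 1, 2, date(2020, 4, 19), date(2020, 5, 5)),
    (1, 1, 2, date(2020, 6, 30), date(2020, 7, 24)),
    (1, 1, 2, date(2020, 7, 27), date(2999, 12, 31)),
]


def island_starts(rows):
    """Step 1 (the CTE): keep rows that open an island, plus the previous row's dt."""
    starts = []
    ordered = sorted(rows, key=lambda r: (r[:3], r[3]))
    for _, grp in groupby(ordered, key=lambda r: r[:3]):  # PARTITION BY a,b,c
        prev_max_end = prev_dt = None
        for a, b, c, dt, dt_f in grp:
            # QUALIFY: first row of the partition, or gap of more than one month
            if prev_max_end is None or add_months(prev_max_end, 1) < dt:
                starts.append((a, b, c, dt, prev_dt))
            prev_max_end = dt_f if prev_max_end is None else max(prev_max_end, dt_f)
            prev_dt = dt  # Lag(dt) as seen by the next row
    return starts


def close_islands(starts):
    """Step 2 (the outer SELECT): derive dt_f from the next island start."""
    out = []
    ordered = sorted(starts, key=lambda r: (r[:2], r[3]))
    for _, grp in groupby(ordered, key=lambda r: r[:2]):  # PARTITION BY a,b
        grp = list(grp)
        for cur, nxt in zip(grp, grp[1:] + [None]):
            a, b, c, dt, _ = cur
            if nxt is None:
                dt_f = HIGH_DATE                    # no next island: Lead default
            elif nxt[2] != c:
                dt_f = nxt[3] - timedelta(days=1)   # c changes: next start - 1 day
            else:
                dt_f = add_months(nxt[4], 1)        # same c: prev_dt + 1 month
            out.append((a, b, c, dt, dt_f))
    return out


for a, b, c, dt, dt_f in close_islands(island_starts(rows)):
    print(a, b, c, dt.strftime("%d/%m/%Y"), dt_f.strftime("%d/%m/%Y"))
```

For the sample rows the emulation ends the three islands on 27/02/2020, 19/05/2020 and 31/12/2999, which agrees with the required output in the question.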