Teradata NORMALIZE options

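I have been using td_normalize_overlap_meet to collapse periods. I have seen some examples on forums that use CNT to identify the number of periods that were collapsed, and I have searched the documentation for other similar functions but found nothing. I am specifically looking for one that retains the start date of the last collapsed period. For example, say I have these periods: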

+-----------+------------+
|Start Date | End Date   |
+-----------+------------+
|2018-01-02 | 2018-01-04 |
|2018-01-05 | 2018-01-07 |
|2018-01-08 | 2018-01-10 |
+-----------+------------+

And I collapse them into this:

+-----------+------------+-----+
|Start Date | End Date   | CNT |
+-----------+------------+-----+
|2018-01-02 | 2018-01-10 | 3   |
+-----------+------------+-----+

Is there a function similar to CNT that would give me this?

+-----------+------------+-----+------------------------------------+
|Start Date | End Date   | CNT | Last Collapsed Period's Start Date |
+-----------+------------+-----+------------------------------------+
|2018-01-02 | 2018-01-10 | 3   | 2018-01-08                         |
+-----------+------------+-----+------------------------------------+

SELECT NORMALIZE has a very simple syntax (especially compared to those td_normalize... functions), but it cannot return the row count or the start date of the last row.

The easiest way to get the desired result is to apply the nPath table operator. Assuming there is more than one group of rows to normalize:

WITH cte AS 
 ( -- the base Select creating the not yet normalized rows
   SELECT *
   FROM mytab
 )
SELECT * 
FROM 
   NPath(ON cte
         PARTITION BY col -- grouping column(s)
         ORDER BY Start_date
         USING
           MODE (NonOverlapping)
           Symbols (start_date-1 > lag(end_date, 1, date '0001-01-01') AS newgrp, -- starting row of a group of overlapping rows
                    start_date-1 <= lag(end_date, 1) AS x)                        -- overlapping row
           Pattern ('newgrp.x*')                                                  -- start plus overlapping rows
           RESULT(First (col OF newgrp) AS col,                                   -- grouping column(s)
                  first (start_date OF ANY(newgrp, x)) AS start_date,             -- start date of group 
                  last  (end_date   OF ANY(newgrp, x)) AS end_date,               -- end date of group
                  Count (*          OF ANY(newgrp, x)) AS Cnt,                    -- number of rows in group
                  last  (start_date OF ANY(newgrp, x)) AS last_start              -- start date of last row in group
                 )
        );
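
To sanity-check the result on a small sample without a Teradata system, the grouping rule can be modelled in a few lines of plain Python. This is only a sketch of the logic the query describes, not Teradata code: the function name `collapse` and the tuple layout are made up for illustration, and it handles a single partition (one value of `col`). Like the query, it compares each row only with the end date of the row immediately before it.

```python
from datetime import date, timedelta


def collapse(periods):
    """Collapse (start, end) periods that overlap or meet.

    Returns one (start, end, cnt, last_start) tuple per group, mirroring
    the RESULT columns of the nPath query for a single partition.
    """
    groups = []
    prev_end = None
    for start, end in sorted(periods):  # ORDER BY start_date
        # newgrp: first row, or a gap of more than one day after the previous row
        if prev_end is None or start - timedelta(days=1) > prev_end:
            groups.append([start, end, 1, start])
        else:
            # x: this row overlaps or meets the previous row
            grp = groups[-1]
            grp[1] = end    # last(end_date)
            grp[2] += 1     # count(*)
            grp[3] = start  # last(start_date)
        prev_end = end
    return [tuple(g) for g in groups]


# The three periods from the question
periods = [
    (date(2018, 1, 2), date(2018, 1, 4)),
    (date(2018, 1, 5), date(2018, 1, 7)),
    (date(2018, 1, 8), date(2018, 1, 10)),
]

for row in collapse(periods):
    print(row)
```

For the three periods from the question this produces a single group running from 2018-01-02 to 2018-01-10 with a count of 3 and a last start date of 2018-01-08, which is the row asked for.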