需要帮助编写 Teradata SQL 来验证从多个 SCD 类型 2 表组合的多个日期范围
Need help to write an Teradata SQL to validate mutiple date ranges combined from more than one SCD type2 tables
我想验证从多个 SCD type2 table 派生的视图。因此,我必须根据两个(或更多)table 中可用日期的组合创建一个日期范围。请参阅下面的示例。
Table 1:
hid h1name h1_strt_dttm h1_end_dttm
5 value5 2/17/2008 18:19:50 2/16/2009 23:59:59:59
5 value5-upd1 2/17/2009 00:00:00 4/11/2011 23:59:59
5 value5-upd2 4/12/2011 00:00:00 3/5/2012 23:59:59
5 value5-upd3 3/6/2012 00:00:00 11/11/2012 23:59:59
5 value5-upd4 11/12/2012 00:00:00 12/31/9999 23:59:59
Table 2:
hid h2name h2_strt_dttm h2_end_dttm
5 name5 3/7/2008 8:14:44 7/17/2010 23:59:59
5 name5-upd1 7/18/2010 0:00:00 1/1/2011 23:59:59
5 name5-upd2 1/2/2011 0:00:00 3/5/2013 23:59:59
5 name5-upd3 3/6/2013 0:00:00 12/31/9999 23:59:59
结果集:
id strt_dttm end_dttm h1name h2name
5 2/17/2008 18:19:50 3/7/2008 8:14:43 value5 null
5 3/7/2008 8:14:44 2/16/2009 23:59:59 value5 name5
5 2/17/2009 00:00:00 7/17/2010 23:59:59 value5-upd1 name5
5 7/18/2010 00:00:00 1/1/2011 23:59:59 value5-upd1 name5-upd1
5 1/2/2011 00:00:00 4/11/2011 23:59:59 value5-upd1 name5-upd2
5 4/12/2011 00:00:00 3/5/2012 23:59:59 value5-upd2 name5-upd2
5 3/6/2012 00:00:00 11/11/2012 23:59:59 value5-upd3 name5-upd2
5 11/12/2012 00:00:00 3/5/2013 23:59:59 value5-upd4 name5-upd2
5 3/6/2013 00:00:00 1/1/9999 00:00:00 value5-upd4 name5-upd3
实现步骤:我的方法:
1) 将两个(或更多)table 中所有可用的 STRT_DTTM 字段值合并。
2) Distribute/recreate 日期范围为上述步骤 1 中可用的日期。(作为开始日期和结束日期)
3) 然后尝试在新创建的日期范围内识别 CDC 字段 (h1name & h2name ) 并将它们放入结果集中。
查询:
第 1 步:
sel h1.hid hid , h1_strt_dttm dtm
from HIST1_DAR h1
union
sel h2.hid hid ,h2_strt_dttm dtm
from HIST2_DAR h2
第 2 步:
/*************** Step 2 ********************/
WITH ALL_DATE_CTE (hid,dttm,strt_dttm,end_dttm) AS (
sel hid,
dtm+INTERVAL '0.000001' SECOND as dttm ,
dttm-INTERVAL '0.000001' SECOND as strt_dttm,
coalesce(min(strt_dttm-INTERVAL '1' SECOND) over (partition by hid
order by dttm ROWS BETWEEN 1 following
AND 1 following ), cast('9999-01-01 00:00:00.000000' as timestamp(6)) ) as end_dttm
from
(
/*************** Step 1 ********************/
sel h1.hid hid , h1_strt_dttm dtm
from DBNAME.HIST1_DAR h1
union
sel h2.hid hid ,h2_strt_dttm dtm
from DBNAME.HIST2_DAR h2
/*************** Step 1 ********************/ ) a
order by 1,2 )
/**************** Step 2********************/
第 3 步:
/**************** Step 3********************/
sel
h1.hid id,
cte.strt_dttm,
cte.end_dttm,
h1.h1name,
h1_strt_dttm , h1_end_dttm,
h2.h2name,
h2_strt_dttm,h2_end_dttm
from DBNAME.HIST1_DAR h1
inner join ALL_DATE_CTE cte
on h1.hid = cte.hid
and cte.dttm between h1_strt_dttm and h1_end_dttm
inner join DBNAME.HIST2_DAR h2
on cte.hid = h2.hid
and cte.dttm between h2_strt_dttm and h2_end_dttm
where h1.hid = 5;
/**************** Step 3********************/
现在问题出在第 2 步:我无法在子查询和 CTE 中使用 order by。没有订购开始日期,我无法正确列出日期范围。
任何人都可以指导我解决方案或其他简单方法。请注意,因为我是 QA,所以我没有对大多数数据库的 create table 访问权限。因此,任何使用 select 的建议都会更有价值。
我已经包含了我用于 table 创建的 DDL 和 DML。
create table DBNAME. hist1_dar ( hid integer , h1name varchar(20), h1_strt_dttm timestamp(6), h1_end_dttm timestamp(6));
create table DBNAME. hist2_dar ( hid integer , h2name varchar(20), h2_strt_dttm timestamp(6), h2_end_dttm timestamp(6));
ins DBNAME. hist1_dar (1, 'value1' , '1987-07-02 08:30:00.000000', '1989-12-02 23:59:59.000000');
ins DBNAME. hist1_dar (2, 'value2' , '1997-09-12 05:20:10.000000' , '1999-12-12 23:59:59.000000');
ins DBNAME. hist1_dar (3, 'value3' , '2000-06-30 07:10:50.000000', '2001-02-10 23:59:59.000000');
ins DBNAME. hist1_dar (4, 'value4' , '2006-05-23 23:32:13.000000', '2007-07-03 23:59:59.000000');
ins DBNAME. hist1_dar (5, 'value5' , '2009-02-17 18:19:50.000000', '2011-04-11 11:09:59.000000' );
ins DBNAME. hist1_dar (1, 'value1-upd1' , '1990-01-01 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (2, 'value2-upd1' , '2000-01-01 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (3, 'value3-upd1' , '2001-02-11 01:01:01.000000', '2002-07-12 23:59:59.000000');
ins DBNAME. hist1_dar (3, 'value3-upd2' , '2002-07-13 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (4, 'value4-upd1' , '2007-07-03 01:01:01.000000' , '2007-10-17 23:59:59.000000' );
ins DBNAME. hist1_dar (4, 'value4-upd2' , '2007-10-18 01:01:01.000000' , '2007-12-23 23:59:59.000000' );
ins DBNAME. hist1_dar (4, 'value4-upd3' , '2007-12-24 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (5, 'value5-upd1' , '2009-02-17 01:01:01.000000', '2011-04-11 23:59:59.000000' );
ins DBNAME. hist1_dar (5, 'value5-upd2' , '2011-04-11 01:01:01.000000', '2012-03-05 23:59:59.000000' );
ins DBNAME. hist1_dar (5, 'value5-upd3' , '2012-03-05 01:01:01.000000', '2012-11-11 23:59:59.000000' );
ins DBNAME. hist1_dar (5, 'value5-upd4' , '2012-11-12 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (1, 'name1' , '1988-08-12 18:20:10.000000', '1990-03-02 23:59:59.000000');
ins DBNAME. hist2_dar (2, 'name2' , '1993-05-19 15:12:30.000000', '1999-11-01 23:59:59.000000');
ins DBNAME. hist2_dar (3, 'name3' , '2002-09-20 17:19:57.000000', '2003-04-10 23:59:59.000000');
ins DBNAME. hist2_dar (4, 'name4' , '2004-04-18 13:38:23.000000' , '2005-07-28 23:59:59.000000');
ins DBNAME. hist2_dar (5, 'name5' , '2008-03-07 08:14:44.000000', '2010-07-17 23:59:59.000000');
ins DBNAME. hist2_dar (1, 'name1-upd1' , '1990-03-03 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (2, 'name2-upd1' , '1999-11-02 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (3, 'name3-upd1' , '2003-04-11 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (4, 'name4-upd1' , '2005-07-28 21:28:56.000000' , '2008-08-19 23:59:59.000000' );
ins DBNAME. hist2_dar (4, 'name4-upd2' , '2008-08-20 01:01:01.000000' , '2007-12-23 23:59:59.000000' );
ins DBNAME. hist2_dar (5, 'name5-upd1' , '2010-07-18 01:01:01.000000', '2011-01-01 23:59:59.000000' );
ins DBNAME. hist2_dar (5, 'name5-upd2' , '2011-01-02 01:01:01.000000', '2013-03-05 23:59:59.000000' );
ins DBNAME. hist2_dar (5, 'name5-upd3' , '2013-03-06 01:01:01.000000', '9999-12-31 23:59:59.000000' );
您的 INSERT 与您的示例不匹配。
如果您的第 2 步 returns 数据正确,您只需添加 LAST_VALUE:
WITH cte (
hid
,strt_dttm
,end_dttm
,h1name
,h2name
)
AS (
SELECT hid
,dttm
,coalesce(min(dttm - INTERVAL '1' SECOND) OVER (
PARTITION BY hid ORDER BY dttm ROWS BETWEEN 1 following
AND 1 following
), TIMESTAMP '9999-01-01 00:00:00') AS end_dttm
,h1name
,h2name
FROM (
/*************** Step 1 ********************/
SELECT h1.hid hid
,h1_strt_dttm dttm
,h1name
,cast(NULL AS VARCHAR(20)) AS h2name
FROM HIST1_DAR h1
UNION ALL
SELECT h2.hid hid
,h2_strt_dttm dttm
,cast(NULL AS VARCHAR(20))
,h2name
FROM HIST2_DAR h2
/*************** Step 1 ********************/
) a
--order by 1,2
)
SELECT hid
,strt_dttm
,end_dttm
,last_value(h1name ignore nulls) OVER (
PARTITION BY hid ORDER BY strt_dttm
)
,last_value(h2name ignore nulls) OVER (
PARTITION BY hid ORDER BY strt_dttm
)
FROM cte
我想验证从多个 SCD type2 table 派生的视图。因此,我必须根据两个(或更多)table 中可用日期的组合创建一个日期范围。请参阅下面的示例。
Table 1:
hid h1name h1_strt_dttm h1_end_dttm
5 value5 2/17/2008 18:19:50 2/16/2009 23:59:59:59
5 value5-upd1 2/17/2009 00:00:00 4/11/2011 23:59:59
5 value5-upd2 4/12/2011 00:00:00 3/5/2012 23:59:59
5 value5-upd3 3/6/2012 00:00:00 11/11/2012 23:59:59
5 value5-upd4 11/12/2012 00:00:00 12/31/9999 23:59:59
Table 2:
hid h2name h2_strt_dttm h2_end_dttm
5 name5 3/7/2008 8:14:44 7/17/2010 23:59:59
5 name5-upd1 7/18/2010 0:00:00 1/1/2011 23:59:59
5 name5-upd2 1/2/2011 0:00:00 3/5/2013 23:59:59
5 name5-upd3 3/6/2013 0:00:00 12/31/9999 23:59:59
结果集:
id strt_dttm end_dttm h1name h2name
5 2/17/2008 18:19:50 3/7/2008 8:14:43 value5 null
5 3/7/2008 8:14:44 2/16/2009 23:59:59 value5 name5
5 2/17/2009 00:00:00 7/17/2010 23:59:59 value5-upd1 name5
5 7/18/2010 00:00:00 1/1/2011 23:59:59 value5-upd1 name5-upd1
5 1/2/2011 00:00:00 4/11/2011 23:59:59 value5-upd1 name5-upd2
5 4/12/2011 00:00:00 3/5/2012 23:59:59 value5-upd2 name5-upd2
5 3/6/2012 00:00:00 11/11/2012 23:59:59 value5-upd3 name5-upd2
5 11/12/2012 00:00:00 3/5/2013 23:59:59 value5-upd4 name5-upd2
5 3/6/2013 00:00:00 1/1/9999 00:00:00 value5-upd4 name5-upd3
实现步骤:我的方法:
1) 将两个(或更多)table 中所有可用的 STRT_DTTM 字段值合并。
2) Distribute/recreate 日期范围为上述步骤 1 中可用的日期。(作为开始日期和结束日期)
3) 然后尝试在新创建的日期范围内识别 CDC 字段 (h1name & h2name ) 并将它们放入结果集中。 查询:
第 1 步:
sel h1.hid hid , h1_strt_dttm dtm
from HIST1_DAR h1
union
sel h2.hid hid ,h2_strt_dttm dtm
from HIST2_DAR h2
第 2 步:
/*************** Step 2 ********************/
WITH ALL_DATE_CTE (hid,dttm,strt_dttm,end_dttm) AS (
sel hid,
dtm+INTERVAL '0.000001' SECOND as dttm ,
dttm-INTERVAL '0.000001' SECOND as strt_dttm,
coalesce(min(strt_dttm-INTERVAL '1' SECOND) over (partition by hid
order by dttm ROWS BETWEEN 1 following
AND 1 following ), cast('9999-01-01 00:00:00.000000' as timestamp(6)) ) as end_dttm
from
(
/*************** Step 1 ********************/
sel h1.hid hid , h1_strt_dttm dtm
from DBNAME.HIST1_DAR h1
union
sel h2.hid hid ,h2_strt_dttm dtm
from DBNAME.HIST2_DAR h2
/*************** Step 1 ********************/ ) a
order by 1,2 )
/**************** Step 2********************/
第 3 步:
/**************** Step 3********************/
sel
h1.hid id,
cte.strt_dttm,
cte.end_dttm,
h1.h1name,
h1_strt_dttm , h1_end_dttm,
h2.h2name,
h2_strt_dttm,h2_end_dttm
from DBNAME.HIST1_DAR h1
inner join ALL_DATE_CTE cte
on h1.hid = cte.hid
and cte.dttm between h1_strt_dttm and h1_end_dttm
inner join DBNAME.HIST2_DAR h2
on cte.hid = h2.hid
and cte.dttm between h2_strt_dttm and h2_end_dttm
where h1.hid = 5;
/**************** Step 3********************/
现在问题出在第 2 步:我无法在子查询和 CTE 中使用 order by。没有订购开始日期,我无法正确列出日期范围。
任何人都可以指导我解决方案或其他简单方法。请注意,因为我是 QA,所以我没有对大多数数据库的 create table 访问权限。因此,任何使用 select 的建议都会更有价值。 我已经包含了我用于 table 创建的 DDL 和 DML。
create table DBNAME. hist1_dar ( hid integer , h1name varchar(20), h1_strt_dttm timestamp(6), h1_end_dttm timestamp(6));
create table DBNAME. hist2_dar ( hid integer , h2name varchar(20), h2_strt_dttm timestamp(6), h2_end_dttm timestamp(6));
ins DBNAME. hist1_dar (1, 'value1' , '1987-07-02 08:30:00.000000', '1989-12-02 23:59:59.000000');
ins DBNAME. hist1_dar (2, 'value2' , '1997-09-12 05:20:10.000000' , '1999-12-12 23:59:59.000000');
ins DBNAME. hist1_dar (3, 'value3' , '2000-06-30 07:10:50.000000', '2001-02-10 23:59:59.000000');
ins DBNAME. hist1_dar (4, 'value4' , '2006-05-23 23:32:13.000000', '2007-07-03 23:59:59.000000');
ins DBNAME. hist1_dar (5, 'value5' , '2009-02-17 18:19:50.000000', '2011-04-11 11:09:59.000000' );
ins DBNAME. hist1_dar (1, 'value1-upd1' , '1990-01-01 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (2, 'value2-upd1' , '2000-01-01 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (3, 'value3-upd1' , '2001-02-11 01:01:01.000000', '2002-07-12 23:59:59.000000');
ins DBNAME. hist1_dar (3, 'value3-upd2' , '2002-07-13 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (4, 'value4-upd1' , '2007-07-03 01:01:01.000000' , '2007-10-17 23:59:59.000000' );
ins DBNAME. hist1_dar (4, 'value4-upd2' , '2007-10-18 01:01:01.000000' , '2007-12-23 23:59:59.000000' );
ins DBNAME. hist1_dar (4, 'value4-upd3' , '2007-12-24 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist1_dar (5, 'value5-upd1' , '2009-02-17 01:01:01.000000', '2011-04-11 23:59:59.000000' );
ins DBNAME. hist1_dar (5, 'value5-upd2' , '2011-04-11 01:01:01.000000', '2012-03-05 23:59:59.000000' );
ins DBNAME. hist1_dar (5, 'value5-upd3' , '2012-03-05 01:01:01.000000', '2012-11-11 23:59:59.000000' );
ins DBNAME. hist1_dar (5, 'value5-upd4' , '2012-11-12 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (1, 'name1' , '1988-08-12 18:20:10.000000', '1990-03-02 23:59:59.000000');
ins DBNAME. hist2_dar (2, 'name2' , '1993-05-19 15:12:30.000000', '1999-11-01 23:59:59.000000');
ins DBNAME. hist2_dar (3, 'name3' , '2002-09-20 17:19:57.000000', '2003-04-10 23:59:59.000000');
ins DBNAME. hist2_dar (4, 'name4' , '2004-04-18 13:38:23.000000' , '2005-07-28 23:59:59.000000');
ins DBNAME. hist2_dar (5, 'name5' , '2008-03-07 08:14:44.000000', '2010-07-17 23:59:59.000000');
ins DBNAME. hist2_dar (1, 'name1-upd1' , '1990-03-03 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (2, 'name2-upd1' , '1999-11-02 01:01:01.000000', '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (3, 'name3-upd1' , '2003-04-11 01:01:01.000000' , '9999-12-31 23:59:59.000000');
ins DBNAME. hist2_dar (4, 'name4-upd1' , '2005-07-28 21:28:56.000000' , '2008-08-19 23:59:59.000000' );
ins DBNAME. hist2_dar (4, 'name4-upd2' , '2008-08-20 01:01:01.000000' , '2007-12-23 23:59:59.000000' );
ins DBNAME. hist2_dar (5, 'name5-upd1' , '2010-07-18 01:01:01.000000', '2011-01-01 23:59:59.000000' );
ins DBNAME. hist2_dar (5, 'name5-upd2' , '2011-01-02 01:01:01.000000', '2013-03-05 23:59:59.000000' );
ins DBNAME. hist2_dar (5, 'name5-upd3' , '2013-03-06 01:01:01.000000', '9999-12-31 23:59:59.000000' );
您的 INSERT 与您的示例不匹配。
如果您的第 2 步 returns 数据正确,您只需添加 LAST_VALUE:
WITH cte (
hid
,strt_dttm
,end_dttm
,h1name
,h2name
)
AS (
SELECT hid
,dttm
,coalesce(min(dttm - INTERVAL '1' SECOND) OVER (
PARTITION BY hid ORDER BY dttm ROWS BETWEEN 1 following
AND 1 following
), TIMESTAMP '9999-01-01 00:00:00') AS end_dttm
,h1name
,h2name
FROM (
/*************** Step 1 ********************/
SELECT h1.hid hid
,h1_strt_dttm dttm
,h1name
,cast(NULL AS VARCHAR(20)) AS h2name
FROM HIST1_DAR h1
UNION ALL
SELECT h2.hid hid
,h2_strt_dttm dttm
,cast(NULL AS VARCHAR(20))
,h2name
FROM HIST2_DAR h2
/*************** Step 1 ********************/
) a
--order by 1,2
)
SELECT hid
,strt_dttm
,end_dttm
,last_value(h1name ignore nulls) OVER (
PARTITION BY hid ORDER BY strt_dttm
)
,last_value(h2name ignore nulls) OVER (
PARTITION BY hid ORDER BY strt_dttm
)
FROM cte