How to get min date and max date if results remain unchanged
I have an address SCD Type 2 table, but sometimes the information entered remains unchanged:
OBJID BEGDA ENDDA HASHROW_COL RK
83022088 2012-03-30 2012-10-28 e1-ef-a9-36 1
83022088 2012-10-29 2013-09-07 63-69-e5-25 2
83022088 2013-09-08 2014-08-30 e1-ef-a9-36 3
83022088 2014-08-31 2016-11-26 e1-ef-a9-36 4
83022088 2016-11-27 9999-12-31 e1-ef-a9-36 5
Note that HASHROW_COL remains unchanged in rows 3 to 5.
Desired result:
OBJID BEGDA ENDDA HASHROW_COL RK
83022088 2012-03-30 2012-10-28 e1-ef-a9-36 1
83022088 2012-10-29 2013-09-07 63-69-e5-25 2
83022088 2013-09-08 9999-12-31 e1-ef-a9-36 3
Current query:
select a.objid, a.hashrow_col,
  case when a.objid <> b.objid then b.begda
       when a.hashrow_col = b.hashrow_col and (b.begda - interval '1' day <= a.endda) then
         a.begda end,
  case when a.objid <> b.objid then b.endda
       when (a.hashrow_col = b.hashrow_col) and (b.begda - interval '1' day <= a.endda) and b.endda > a.endda
         then b.endda
  end
from
  (select objid, begda, endda, HASHROW_COL
   from OTABLE) a
inner join
  (select objid, begda, endda, HASHROW_COL
   from OTABLE) b
on
  a.objid = b.objid
where
  a.objid = '83022088'
order by a.OBJID, a.BEGDA, a.HASHROW_COL;
This is a gaps-and-islands problem. In this case, I would use:
select objid, min(begda) as begda, max(endda) as endda, HASHROW_COL,
       row_number() over (partition by objid order by min(begda)) as ranking
from (select t.*,
             -- a new island starts whenever the previous row with the same hash does not end the day before
             sum(case when prev_endda = begda - interval '1' day then 0 else 1 end) over (partition by objid order by begda) as grp
      from (select t.*,
                   lag(endda) over (partition by objid, HASHROW_COL order by begda) as prev_endda
            from t
           ) t
     ) t
group by grp, objid, HASHROW_COL;
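To check the gaps-and-islands logic against the sample rows from the question, here is a minimal sketch that runs an equivalent query in SQLite through Python's `sqlite3` module. Several things are assumptions made for the demo and are not from the answer: the table name `otable`, the helper `merge_islands`, dates stored as ISO text, SQLite's `date(begda, '-1 day')` standing in for `begda - interval '1' day`, and the ranking being computed in an outer select rather than directly over `min(begda)`. It needs a SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Sample rows from the question: (objid, begda, endda, hashrow_col)
ROWS = [
    ("83022088", "2012-03-30", "2012-10-28", "e1-ef-a9-36"),
    ("83022088", "2012-10-29", "2013-09-07", "63-69-e5-25"),
    ("83022088", "2013-09-08", "2014-08-30", "e1-ef-a9-36"),
    ("83022088", "2014-08-31", "2016-11-26", "e1-ef-a9-36"),
    ("83022088", "2016-11-27", "9999-12-31", "e1-ef-a9-36"),
]

# Same idea as the query above, rewritten in SQLite syntax (demo assumption).
QUERY = """
select objid, begda, endda, hashrow_col,
       row_number() over (partition by objid order by begda) as rk
from (select objid, min(begda) as begda, max(endda) as endda, hashrow_col
      from (select l.*,
                   -- running count of island starts, per objid
                   sum(case when prev_endda = date(begda, '-1 day') then 0 else 1 end)
                       over (partition by objid order by begda) as grp
            from (select o.*,
                         -- end date of the previous row with the same hash
                         lag(endda) over (partition by objid, hashrow_col
                                          order by begda) as prev_endda
                  from otable o) l) g
      group by grp, objid, hashrow_col) m
order by objid, begda
"""


def merge_islands(rows):
    """Load rows into an in-memory table and return the merged date ranges."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table otable (objid text, begda text, endda text, hashrow_col text)"
    )
    con.executemany("insert into otable values (?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in merge_islands(ROWS):
        print(row)
```

On the sample data this collapses rows 3 to 5 into a single range while keeping row 1 separate, because row 3 does not start the day after row 1 ends.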
Assuming hashrow_col is the value that stays the same (i.e. it gives you the value you need to group by), then you only need min(begda) and max(endda):
select objid, min(begda) as begda, max(endda) as endda, HASHROW_COL,
       min(rk) as rk
from OTABLE
group by objid, hashrow_col;
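It may be worth running this plain GROUP BY against the sample rows before relying on it. The sketch below does that in SQLite through Python's `sqlite3`; the helper name `group_by_hash`, the alias `first_rk`, and storing dates as ISO text are assumptions for the demo, not part of the answer.

```python
import sqlite3

# Sample rows from the question: (objid, begda, endda, hashrow_col, rk)
SAMPLE = [
    ("83022088", "2012-03-30", "2012-10-28", "e1-ef-a9-36", 1),
    ("83022088", "2012-10-29", "2013-09-07", "63-69-e5-25", 2),
    ("83022088", "2013-09-08", "2014-08-30", "e1-ef-a9-36", 3),
    ("83022088", "2014-08-31", "2016-11-26", "e1-ef-a9-36", 4),
    ("83022088", "2016-11-27", "9999-12-31", "e1-ef-a9-36", 5),
]


def group_by_hash(rows):
    """Group purely on (objid, hashrow_col), ignoring whether rows are adjacent."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table otable "
        "(objid text, begda text, endda text, hashrow_col text, rk integer)"
    )
    con.executemany("insert into otable values (?, ?, ?, ?, ?)", rows)
    result = con.execute(
        """
        select objid, min(begda) as begda, max(endda) as endda,
               hashrow_col, min(rk) as first_rk
        from otable
        group by objid, hashrow_col
        order by first_rk
        """
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in group_by_hash(SAMPLE):
        print(row)
```

With the question's data, the hash `e1-ef-a9-36` occurs in row 1 and again in rows 3 to 5, so this returns two rows and the first one spans 2012-03-30 to 9999-12-31. The approach therefore seems to fit only when a hash value never comes back after a different value for the same objid.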
Here is a solution using the NORMALIZE function and the PERIOD data type.
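We have to convert your begin and end dates into a period. One gotcha with NORMALIZE is that consecutive end and begin dates have to meet or overlap, which is why I add 1 day to the end date (unless it is '9999-12-31').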
select normalize on meets or overlaps objid,hashrow_col,duration from (
select
t.*,
period(begda, case when endda = '9999-12-31' then endda else endda +interval '1' DAY end) as duration
from
<your table> t) tt
NORMALIZE and PERIOD are very powerful in the right circumstances.
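For readers without a Teradata system at hand, the following is a rough Python sketch of what merging on "meets or overlaps" amounts to for this data. It is not Teradata's implementation; the names `to_period`, `normalize`, and `FOREVER` are made up for illustration, and periods are modelled as half-open (begin, end) tuples of `datetime.date`.

```python
from datetime import date, timedelta
from itertools import groupby

FOREVER = date(9999, 12, 31)  # open-ended marker used in the question


def to_period(begda, endda):
    """Build a half-open period [begda, endda + 1 day).

    9999-12-31 is left alone, as in the CASE expression of the query,
    because adding a day to it would overflow the date range.
    """
    end = endda if endda == FOREVER else endda + timedelta(days=1)
    return (begda, end)


def normalize(rows):
    """Merge periods that meet or overlap, per (objid, hashrow_col).

    rows: iterable of (objid, hashrow_col, begda, endda) with date values.
    Returns a list of (objid, hashrow_col, period_start, period_end).
    """
    keyed = sorted((objid, h, *to_period(b, e)) for objid, h, b, e in rows)
    merged = []
    for (objid, h), group in groupby(keyed, key=lambda r: r[:2]):
        cur_start = cur_end = None
        for _, _, start, end in group:
            if cur_start is not None and start <= cur_end:
                # start == cur_end means "meets", start < cur_end means "overlaps"
                cur_end = max(cur_end, end)
            else:
                if cur_start is not None:
                    merged.append((objid, h, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((objid, h, cur_start, cur_end))
    return merged


if __name__ == "__main__":
    sample = [
        ("83022088", "e1-ef-a9-36", date(2012, 3, 30), date(2012, 10, 28)),
        ("83022088", "63-69-e5-25", date(2012, 10, 29), date(2013, 9, 7)),
        ("83022088", "e1-ef-a9-36", date(2013, 9, 8), date(2014, 8, 30)),
        ("83022088", "e1-ef-a9-36", date(2014, 8, 31), date(2016, 11, 26)),
        ("83022088", "e1-ef-a9-36", date(2016, 11, 27), FOREVER),
    ]
    for row in normalize(sample):
        print(row)
```

The merged periods are end-exclusive in this sketch, so to recover an ENDDA in the original inclusive style you would subtract one day from the period end, again leaving 9999-12-31 untouched.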