合并开始日期和结束日期周期日期
merging start and end date cycle dates
我有以下 table。
CUST_ID START_CYCLE END_CYCLE WORKER CUST_SUB_ID CUST_SUB_TYPE
101 1/1/2019 1/31/2019 ABC123 134 HIGH_SUB
101 2/1/2019 4/30/2019 ABC123 136 HIGH_SUB
101 5/1/2019 7/31/2019 ABC123 1414 HIGH_SUB
101 8/1/2019 8/30/2019 ABC123 1469 HIGH_SUB
发生以下情况时,我需要合并日期(排除行)。
条件:
lead
cust_sub_id <>
cust_sub_id
lead
cust_sub_type = cust_sub_type
lead
cust_id = cust_id
lead
工人=工人
最终输出:
CUST_ID START_CYCLE END_CYCLE WORKER CUST_SUB_ID CUST_SUB_TYPE
101 1/1/2019 8/30/2019 ABC123 134:136:1414:1469 HIGH_SUB
SQL:
select
cust_id,
start_cycle,
end_cycle,
worker,
cust_sub_id,
cust_sub_type,
gaps_in_srv,
case
when gaps_in_srv = 'N'
and cust_sub_id <> lead(cust_sub_id) over (partition by cust_id order by start_cycle)
and cust_id = lead(cust_id) over (partition by cust_id order by start_cycle)
and worker = lead(worker) over (partition by cust_id order by start_cycle)
then 'Y'
else 'N'
end merge_dts,
dense_rank() over (partition by cust_id, cust_sub_id order by start_cycle) rnk_fst_dt,
dense_rank() over (partition by cust_id, cust_sub_id order by start_cycle desc) rnk_lst_dt
from cust_cycle
删除行后,我可以使用 listagg
捕获 cust_sub_id,但在应用 listagg 函数之前需要帮助删除这些行。
您可以将此视为间隙和孤岛问题,使用 lag()
确定 "island" 从何处开始,并使用累积和为孤岛分配分组编号:
select cust_id, worker, cust_sub_type, min(start_cycle), max(end_cycle),
listagg(cust_sub_id, ':') within group (order by start_cycle) as cust_sub_ids
from (select cc.*,
sum(case when prev_ec <> start_cycle - interval '1' day or
prev_cst <> cust_sub_type
then 1 else 0
end) over (partition by cust_id, worker order by start_cycle) as grp
from (select cc.*,
lag(end_cycle) over (partition by cust_id, worker order by start_cycle) as prev_ec,
lag(cust_sub_type) over (partition by cust_id, worker order by start_cycle) as prev_cst
from cust_cycle cc
) cc
) cc
group by cust_id, worker, cust_sub_type, worker
我有以下 table。
CUST_ID START_CYCLE END_CYCLE WORKER CUST_SUB_ID CUST_SUB_TYPE
101 1/1/2019 1/31/2019 ABC123 134 HIGH_SUB
101 2/1/2019 4/30/2019 ABC123 136 HIGH_SUB
101 5/1/2019 7/31/2019 ABC123 1414 HIGH_SUB
101 8/1/2019 8/30/2019 ABC123 1469 HIGH_SUB
发生以下情况时,我需要合并日期(排除行)。
条件:
lead
cust_sub_id <>
cust_sub_id
lead
cust_sub_type = cust_sub_type
lead
cust_id = cust_id
lead
工人=工人
最终输出:
CUST_ID START_CYCLE END_CYCLE WORKER CUST_SUB_ID CUST_SUB_TYPE
101 1/1/2019 8/30/2019 ABC123 134:136:1414:1469 HIGH_SUB
SQL:
select
cust_id,
start_cycle,
end_cycle,
worker,
cust_sub_id,
cust_sub_type,
gaps_in_srv,
case
when gaps_in_srv = 'N'
and cust_sub_id <> lead(cust_sub_id) over (partition by cust_id order by start_cycle)
and cust_id = lead(cust_id) over (partition by cust_id order by start_cycle)
and worker = lead(worker) over (partition by cust_id order by start_cycle)
then 'Y'
else 'N'
end merge_dts,
dense_rank() over (partition by cust_id, cust_sub_id order by start_cycle) rnk_fst_dt,
dense_rank() over (partition by cust_id, cust_sub_id order by start_cycle desc) rnk_lst_dt
from cust_cycle
删除行后,我可以使用 listagg
捕获 cust_sub_id,但在应用 listagg 函数之前需要帮助删除这些行。
您可以将此视为间隙和孤岛问题,使用 lag()
确定 "island" 从何处开始,并使用累积和为孤岛分配分组编号:
select cust_id, worker, cust_sub_type, min(start_cycle), max(end_cycle),
listagg(cust_sub_id, ':') within group (order by start_cycle) as cust_sub_ids
from (select cc.*,
sum(case when prev_ec <> start_cycle - interval '1' day or
prev_cst <> cust_sub_type
then 1 else 0
end) over (partition by cust_id, worker order by start_cycle) as grp
from (select cc.*,
lag(end_cycle) over (partition by cust_id, worker order by start_cycle) as prev_ec,
lag(cust_sub_type) over (partition by cust_id, worker order by start_cycle) as prev_cst
from cust_cycle cc
) cc
) cc
group by cust_id, worker, cust_sub_type, worker