Gaps and Islands - How to Sum Each Group of Consecutive Rows by ID
Below is my current SQL code and its output. I only need the sum of EFF_DAYS for each run of consecutive (or single) rows where CD equals STG (highlighted in yellow).
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT) RN,
       Z2.*
FROM (
    SELECT CASE WHEN (LAG_CD IS NULL OR LAG_CD NOT IN ('STG')) AND CD IN ('STG')
                THEN RANK() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT)
                WHEN CD = LAG_CD AND CD IN ('STG')
                THEN RANK() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT)
                WHEN CD = LAG_CD AND CD != LEAD_CD
                THEN RANK() OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT)
           END AS CASES,
           Z.*
    FROM (
        SELECT ID,
               LAG(CD) OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT) AS LAG_CD,
               LEAD(CD) OVER (PARTITION BY ID ORDER BY TMSP, EFF_DT) AS LEAD_CD,
               CD,
               TMSP,
               EFF_DT,
               END_EFF_DT,
               DATEDIFF(day, EFF_DT, END_EFF_DT) AS EFF_DAYS
        FROM #POSTCHG_ROWS
        WHERE ID IN ('ABC123', 'XYZ789')
    ) Z
) Z2
ORDER BY TMSP, EFF_DT
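I have tried various combinations of row numbers and ranks, but I can't seem to get the CASES column right. I have spent hours researching other gaps-and-islands SQL solutions, but none of them covers this exact scenario.
Ideally, my CASES column would come out as shown below, so that I could group by CASES, ID, and the starting TMSP of each block of consecutive rows, and then compute SUM(EFF_DAYS).
Below is my target output: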
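You are only interested in runs of adjacent "STG" rows. I think the simplest approach is a window count of non-"STG" values to define the groups, followed by filtering and aggregation: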
select
    id,
    min(tmsp) tmsp,
    min(eff_dt) eff_dt,
    sum(datediff(day, eff_dt, end_eff_dt)) sum_eff_days
from (
    select
        p.*,
        -- running count of non-'STG' rows: constant across each run of adjacent 'STG' rows
        sum(case when cd = 'STG' then 0 else 1 end)
            over(partition by id order by tmsp) grp
    from #postchg_rows p
) p
where cd = 'STG'   -- keep only the islands
group by id, grp
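To see how the running count separates the islands, here is a minimal sketch on made-up data. The table #demo_rows, its column types, and every value in it are assumptions for illustration only, not the asker's real data. It also adds eff_dt as a tiebreaker in the window ordering, mirroring the ordering used in the question.

```sql
-- Hypothetical sample data (names, types, and values are assumptions).
create table #demo_rows (
    id         varchar(10),
    cd         varchar(3),
    tmsp       datetime2,
    eff_dt     date,
    end_eff_dt date
);

insert into #demo_rows (id, cd, tmsp, eff_dt, end_eff_dt) values
    ('ABC123', 'STG', '20200101', '20200101', '20200105'),  -- grp 0, 4 days
    ('ABC123', 'STG', '20200102', '20200105', '20200110'),  -- grp 0, 5 days
    ('ABC123', 'ACT', '20200103', '20200110', '20200112'),  -- grp 1, filtered out
    ('ABC123', 'STG', '20200104', '20200112', '20200115'),  -- grp 1, 3 days
    ('XYZ789', 'ACT', '20200201', '20200201', '20200203'),  -- grp 1, filtered out
    ('XYZ789', 'STG', '20200202', '20200203', '20200210');  -- grp 1, 7 days

select
    id,
    grp,
    min(tmsp)   start_tmsp,
    min(eff_dt) start_eff_dt,
    sum(datediff(day, eff_dt, end_eff_dt)) sum_eff_days
from (
    select
        d.*,
        -- each non-'STG' row bumps the counter, so adjacent 'STG' rows share one value
        sum(case when cd = 'STG' then 0 else 1 end)
            over(partition by id order by tmsp, eff_dt) grp
    from #demo_rows d
) d
where cd = 'STG'
group by id, grp
order by id, start_tmsp;
-- ABC123, grp 0: sum_eff_days = 9
-- ABC123, grp 1: sum_eff_days = 3
-- XYZ789, grp 1: sum_eff_days = 7

drop table #demo_rows;
```

The grp value only has meaning within one id, which is why the outer query groups by both id and grp. If two rows of the same id can share a tmsp, the default window frame gives them the same running count, so a tiebreaker column such as eff_dt in the ORDER BY is probably worth keeping.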