Condense multiple consecutive rows using first and last row
I'm trying to find a way to condense consecutive similar records into a single row. For example:
Status starttime endtime
State1 2020-11-01 13:00:29.000 2020-11-01 13:03:59.000
State1 2020-11-01 13:03:59.000 2020-11-01 13:04:01.000
State1 2020-11-01 13:04:01.000 2020-11-01 13:05:27.000
State1 2020-11-01 13:05:27.000 2020-11-01 13:05:29.000
State2 2020-11-01 13:05:29.000 2020-11-01 13:11:31.000
State2 2020-11-01 16:19:35.000 2020-11-01 16:19:55.000
would be condensed into
Status starttime endtime
State1 2020-11-01 13:00:29.000 2020-11-01 13:05:29.000
State2 2020-11-01 13:05:29.000 2020-11-01 13:11:31.000
State2 2020-11-01 16:19:35.000 2020-11-01 16:19:55.000
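In this case, the first 4 rows are condensed because they have the same status and their time periods are contiguous. The last 2 rows are not condensed because there is a time gap between them.
Is this possible?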
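This is a gaps-and-islands problem: you want to group together consecutive rows that have the same status and adjacent periods.
You can use window functions. The idea is to define the groups with a window sum that increments every time the status changes or the period breaks (that is, whenever a row's starttime differs from the previous row's endtime):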
select min(status) as status, min(starttime) as starttime, max(endtime) as endtime
from (
    select t.*,
        -- running count of "island starts": stays flat while rows chain together
        sum(case when starttime = lag_endtime and status = lag_status then 0 else 1 end) over(order by starttime) as grp
    from (
        -- fetch the previous row's endtime and status, ordered by starttime
        select t.*,
            lag(endtime) over(order by starttime) lag_endtime,
            lag(status) over(order by starttime) lag_status
        from mytable t
    ) t
) t
group by grp
status | starttime | endtime
:----- | :---------------------- | :----------------------
State1 | 2020-11-01 13:00:29.000 | 2020-11-01 13:05:29.000
State2 | 2020-11-01 13:05:29.000 | 2020-11-01 13:11:31.000
State2 | 2020-11-01 16:19:35.000 | 2020-11-01 16:19:55.000
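To make the grouping logic easier to follow outside of SQL, here is a minimal Python sketch of the same idea, not a replacement for the query. The function name `condense` and the tuple layout are made up for illustration, and the sketch assumes timestamps are fixed-width strings in the format shown above, so that plain string comparison orders them correctly.

```python
# Sample rows from the question: (status, starttime, endtime).
rows = [
    ("State1", "2020-11-01 13:00:29.000", "2020-11-01 13:03:59.000"),
    ("State1", "2020-11-01 13:03:59.000", "2020-11-01 13:04:01.000"),
    ("State1", "2020-11-01 13:04:01.000", "2020-11-01 13:05:27.000"),
    ("State1", "2020-11-01 13:05:27.000", "2020-11-01 13:05:29.000"),
    ("State2", "2020-11-01 13:05:29.000", "2020-11-01 13:11:31.000"),
    ("State2", "2020-11-01 16:19:35.000", "2020-11-01 16:19:55.000"),
]


def condense(rows):
    """Hypothetical helper mirroring the query: lag, running sum, group by."""
    # Equivalent of "order by starttime" (assumes fixed-width timestamp strings).
    ordered = sorted(rows, key=lambda r: r[1])

    groups = {}   # grp number -> rows belonging to that island
    grp = 0       # running sum of island starts
    prev = None   # previous row, playing the role of lag()
    for status, start, end in ordered:
        # A new island starts on the first row, on a status change,
        # or when this row does not begin where the previous one ended.
        if prev is None or status != prev[0] or start != prev[2]:
            grp += 1
        groups.setdefault(grp, []).append((status, start, end))
        prev = (status, start, end)

    # Equivalent of "group by grp" with min/min/max aggregates.
    return [
        (
            min(r[0] for r in members),
            min(r[1] for r in members),
            max(r[2] for r in members),
        )
        for members in groups.values()
    ]


for row in condense(rows):
    print(" | ".join(row))
```

On the sample data this yields three rows: one merged State1 row and the two State2 rows left separate, because the second State2 row does not start where the first one ends.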