First day and last day of a time series depending on a dummy variable in SQL

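I want to get the first day and last day of each period in a time series, depending on a dummy variable, in SQL. I have items, and for each day a dummy variable indicates whether the item is online. For example, consider the following data for specific dates: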

Item    Day         Online
car     01/02/2020  1
car     02/02/2020  1
car     03/02/2020  0
car     04/02/2020  1
car     05/02/2020  1
van     01/02/2020  0
van     02/02/2020  1
van     03/02/2020  0
van     04/02/2020  1
van     05/02/2020  0
bike    01/02/2020  0
bike    02/02/2020  0
bike    03/02/2020  0
bike    04/02/2020  0
bike    05/02/2020  0

I need a result similar to the following:

car:
first day : 01/02/2020
last day : 02/02/2020
first day and last day : 03/02/2020
first day : 04/02/2020
last day : 05/02/2020

van:
first day and last day : 01/02/2020
first day and last day : 02/02/2020
first day and last day : 03/02/2020
first day and last day : 04/02/2020
first day and last day : 05/02/2020

bike:
first day : 01/02/2020
first day and last day : 05/02/2020

I am completely lost, and I would be grateful if anyone could enlighten me. Thanks!

You seem to want to split each item's history into online and offline periods. This is a gaps-and-islands problem. One general approach is:

select item, online, min(day), max(day)
from (select t.*,
             row_number() over (partition by item order by day) as seqnum,
             row_number() over (partition by item, online order by day) as seqnum_2
      from t
     ) t
group by item, online, (seqnum - seqnum_2);

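The point of the row numbers is to define groups in which the online flag is the same. On adjacent rows with the same value, the difference between the two row numbers is constant.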
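To make the row-number difference concrete, here is a minimal sketch that runs the same query against the sample data using SQLite from Python. The table name t matches the query; storing day as ISO-formatted text (so that order by day is chronological) and using an in-memory SQLite database are assumptions for illustration, and the window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample data from the question. Days are stored as ISO text (an assumption)
# so that ordering by day is chronological.
online_flags = {
    "car": [1, 1, 0, 1, 1],
    "van": [0, 1, 0, 1, 0],
    "bike": [0, 0, 0, 0, 0],
}
rows = [
    (item, f"2020-02-{d:02d}", flag)
    for item, flags in online_flags.items()
    for d, flag in enumerate(flags, start=1)
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (item text, day text, online integer)")
conn.executemany("insert into t values (?, ?, ?)", rows)

# Inner query: one row number per item, one per (item, online) pair.
inner = """
select item, day, online,
       row_number() over (partition by item order by day) as seqnum,
       row_number() over (partition by item, online order by day) as seqnum_2
from t
"""

# Intermediate values for one item, to see where the difference changes.
car_steps = conn.execute(
    f"select day, online, seqnum, seqnum_2, seqnum - seqnum_2 as diff "
    f"from ({inner}) s where item = 'car' order by day"
).fetchall()
for step in car_steps:
    print(step)

# Final result: one row per run of identical online values.
islands = conn.execute(
    f"select item, online, min(day) as first_day, max(day) as last_day "
    f"from ({inner}) s "
    f"group by item, online, (seqnum - seqnum_2) "
    f"order by item, first_day"
).fetchall()
for island in islands:
    print(island)
```

For car, the difference is 0 on the first two days, 2 on the offline day, and 1 on the last two days, so grouping by item, online, and the difference yields three periods.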

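In your particular case, online has only two states, so you can use a cumulative sum instead. For each row, count the rows in the opposite state seen so far; that count stays constant within a run of identical values. (A plain sum(online) is constant only across offline rows and would put each online day into its own group.)

select item, online, min(day), max(day)
from (select t.*,
             -- number of opposite-state rows up to and including this day
             (case when online = 1
                   then sum(1 - online) over (partition by item order by day)
                   else sum(online) over (partition by item order by day)
              end) as grp
      from t
     ) t
group by item, online, grp;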
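As a sanity check on the two-state idea, the sketch below runs a cumulative-sum grouping against the sample data in SQLite from Python. It is only an illustration under assumptions: day is stored as ISO-formatted text, the database is in-memory, and the grouping key for each row is the running count of rows in the opposite state, which does not change inside a run of identical online values.

```python
import sqlite3

# Sample data from the question; ISO text dates are an assumption so that
# ordering by day is chronological.
online_flags = {
    "car": [1, 1, 0, 1, 1],
    "van": [0, 1, 0, 1, 0],
    "bike": [0, 0, 0, 0, 0],
}
rows = [
    (item, f"2020-02-{d:02d}", flag)
    for item, flags in online_flags.items()
    for d, flag in enumerate(flags, start=1)
]

conn = sqlite3.connect(":memory:")
conn.execute("create table t (item text, day text, online integer)")
conn.executemany("insert into t values (?, ?, ?)", rows)

query = """
select item, online, min(day) as first_day, max(day) as last_day
from (select t.*,
             -- online rows count offline rows so far, and vice versa
             case when online = 1
                  then sum(1 - online) over (partition by item order by day)
                  else sum(online) over (partition by item order by day)
             end as grp
      from t
     ) t
group by item, online, grp
order by item, first_day
"""

periods = conn.execute(query).fetchall()
for period in periods:
    print(period)
```

On this data the query returns one five-day offline period for bike, three periods for car, and five single-day periods for van.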