SQL: Get date difference between rows in the same column
I am trying to build a report, and this is my input data.
Stage Name Date
1 x 12/05/2019 10:00:03
1 x 12/05/2019 10:05:01
1 y 12/06/2019 12:00:07
2 x 12/06/2019 13:12:03
2 x 12/06/2019 13:23:00
1 y 12/08/2019 16:00:07
2 x 12/09/2019 09:17:59
This is the output I want.
Stage Name DateFrom DateTo DateDiff
1 x 12/05/2019 10:00:03 12/06/2019 12:00:07 1
1 y 12/06/2019 12:00:07 12/06/2019 13:12:03 0
2 x 12/06/2019 13:12:03 12/08/2019 16:00:07 2
1 y 12/08/2019 16:00:07 12/09/2019 09:17:59 1
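I can't use a GROUP BY clause on Stage and Name, because that would group rows 3 and 6 of my input together. I tried joining the table to itself, but did not get the desired result. Is this even possible in SQL? Any ideas would help. I am using Microsoft SQL Server.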
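This is a variation of the gaps-and-islands problem. You want to group together runs of adjacent rows (that is, consecutive rows that share the same stage and name), but use the start date of the next group as the end date of the current group.
Here is one approach: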
    select
        stage,
        name,
        min(date) date_from,
        lead(min(date)) over(order by min(date)) date_to,
        datediff(day, min(date), lead(min(date)) over(order by min(date))) date_diff
    from (
        select
            t.*,
            row_number() over(order by date) rn1,
            row_number() over(partition by stage, name order by date) rn2
        from mytable t
    ) t
    group by stage, name, rn1 - rn2
    order by date_from
stage | name | date_from | date_to | date_diff
----: | :--- | :------------------ | :------------------ | -------:
1 | x | 12/05/2019 10:00:03 | 12/06/2019 12:00:07 | 1
1 | y | 12/06/2019 12:00:07 | 12/06/2019 13:12:03 | 0
2 | x | 12/06/2019 13:12:03 | 12/08/2019 16:00:07 | 2
1 | y | 12/08/2019 16:00:07 | 12/09/2019 09:17:59 | 1
2 | x | 12/09/2019 09:17:59 | null | null
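Note that this does not produce exactly the result you showed: there is one extra pending row at the end of the result set, representing the "on-going" series of records. If needed, you can filter it out by nesting the query: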
    select *
    from (
        select
            stage,
            name,
            min(date) date_from,
            lead(min(date)) over(order by min(date)) date_to,
            datediff(day, min(date), lead(min(date)) over(order by min(date))) date_diff
        from (
            select
                t.*,
                row_number() over(order by date) rn1,
                row_number() over(partition by stage, name order by date) rn2
            from mytable t
        ) t
        group by stage, name, rn1 - rn2
    ) t
    where date_to is not null
    order by date_from
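To see why grouping on `rn1 - rn2` isolates each run of adjacent rows, it may help to run the inner query on its own against the sample data. The following is only a sketch: the temp table name `#mytable`, the column types, and the `island` alias are assumptions made for illustration, and the dates are read as month/day/year.

```sql
-- Hypothetical setup: temp table name and column types are assumed, not taken from the question.
create table #mytable (stage int, name varchar(10), [date] datetime2(0));

insert into #mytable (stage, name, [date]) values
    (1, 'x', '2019-12-05T10:00:03'),
    (1, 'x', '2019-12-05T10:05:01'),
    (1, 'y', '2019-12-06T12:00:07'),
    (2, 'x', '2019-12-06T13:12:03'),
    (2, 'x', '2019-12-06T13:23:00'),
    (1, 'y', '2019-12-08T16:00:07'),
    (2, 'x', '2019-12-09T09:17:59');

-- rn1 numbers all rows by date; rn2 restarts the numbering for each (stage, name) pair.
-- Their difference stays constant while consecutive rows share the same stage and name.
select
    stage,
    name,
    [date],
    row_number() over(order by [date]) as rn1,
    row_number() over(partition by stage, name order by [date]) as rn2,
    row_number() over(order by [date])
        - row_number() over(partition by stage, name order by [date]) as island
from #mytable
order by [date];

-- stage | name | rn1 | rn2 | island
--     1 | x    |   1 |   1 |      0
--     1 | x    |   2 |   2 |      0
--     1 | y    |   3 |   1 |      2
--     2 | x    |   4 |   1 |      3
--     2 | x    |   5 |   2 |      3
--     1 | y    |   6 |   2 |      4
--     2 | x    |   7 |   3 |      4
```

The `island` value alone is not unique (the last two rows both get 4), which is why the outer query groups by `stage, name, rn1 - rn2` rather than by the difference alone. That yields the five groups shown in the result above.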
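This is a variation of the gaps-and-islands problem, but it has a fairly simple solution.
Just keep every row whose previous row has a different stage or name. Then use lead() to get the next date. Here is the basic idea: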
    select t.stage, t.name, t.date as datefrom,
           lead(t.date) over (order by t.date) as dateto,
           datediff(day, t.date, lead(t.date) over (order by t.date)) as diff
    from (select t.*,
                 lag(date) over (partition by stage, name order by date) as prev_sn_date,
                 lag(date) over (order by date) as prev_date
          from t
         ) t
    where prev_sn_date <> prev_date or prev_sn_date is null;
If you really want to filter out the last row, one more step is needed; I am not sure whether that is desirable.
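One possible form of that extra step, offered as a sketch rather than the answerer's own code, is to wrap the query in a derived table and drop the row whose `dateto` is null. The derived-table alias `x` is an assumption added here; the table name `t` is taken from the query above. Like that query, this assumes that no two rows share the same date, and it needs SQL Server 2012 or later for `lag()` and `lead()`.

```sql
-- Sketch: same logic as the query above, nested so the open-ended last row can be filtered out.
select *
from (
    select t.stage, t.name, t.date as datefrom,
           lead(t.date) over (order by t.date) as dateto,
           datediff(day, t.date, lead(t.date) over (order by t.date)) as diff
    from (select t.*,
                 lag(date) over (partition by stage, name order by date) as prev_sn_date,
                 lag(date) over (order by date) as prev_date
          from t
         ) t
    -- keep only the first row of each run of identical (stage, name) values
    where prev_sn_date <> prev_date or prev_sn_date is null
) x  -- hypothetical alias for the derived table
-- the last run has no following row, so its dateto is null
where dateto is not null
order by datefrom;
```

The filter has to sit in an outer query because `where` is evaluated before window functions such as `lead()`, so `dateto` is not yet available at the inner level.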