Rolling Sum when date is continuous
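I'm trying to find out, in SQL, how many consecutive days people have worked. I think a rolling sum might be the solution, but I don't know how to work it out.
My sample data is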
| Employee | work_period |
| 1 | 2019-01-01 |
| 1 | 2019-01-02 |
| 1 | 2019-01-03 |
| 1 | 2019-01-04 |
| 1 | 2019-01-05 |
| 1 | 2019-01-10 |
| 1 | 2019-01-11 |
| 1 | 2019-01-12 |
| 2 | 2019-01-20 |
| 2 | 2019-01-22 |
| 2 | 2019-01-23 |
| 2 | 2019-01-24 |
The desired result is
| Employee | work_period | Continuous Days |
| 1 | 2019-01-01 | 1 |
| 1 | 2019-01-02 | 2 |
| 1 | 2019-01-03 | 3 |
| 1 | 2019-01-04 | 4 |
| 1 | 2019-01-05 | 5 |
| 1 | 2019-01-10 | 1 |
| 1 | 2019-01-11 | 2 |
| 1 | 2019-01-12 | 3 |
| 2 | 2019-01-20 | 1 |
| 2 | 2019-01-22 | 1 |
| 2 | 2019-01-23 | 2 |
| 2 | 2019-01-24 | 3 |
If the days are not consecutive, the count restarts from 1.
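You can first use lag() to check, for each employee, whether the previous row (ordered by work_period) is exactly one day before the current row. If it is, return 0 from a CASE expression, otherwise 1. Then use the windowed version of sum() to total those 0s and 1s per employee in work_period order. This gives you a group number for each run of consecutive days per employee. Finally, PARTITION BY employee and this group number in another windowed sum() that adds 1 for each row in the partition, ordered by work_period.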
SELECT employee,
       work_period,
       sum(1) OVER (PARTITION BY employee,
                                 g
                    ORDER BY work_period) continuous_days
       FROM (SELECT employee,
                    work_period,
                    sum(c) OVER (PARTITION BY employee
                                 ORDER BY work_period) g
                    FROM (SELECT employee,
                                 work_period,
                                 CASE
                                   WHEN lag(work_period) OVER (PARTITION BY employee
                                                               ORDER BY work_period) = dateadd(day, -1, work_period) THEN
                                     0
                                   ELSE
                                     1
                                 END c
                                 FROM elbat) x) y;
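To see what the intermediate columns c and g look like on the sample data, here is a minimal sketch that runs the same logic through Python's sqlite3 module. It assumes SQLite 3.25 or later (for window functions) rather than SQL Server, so date(work_period, '-1 day') stands in for dateadd(day, -1, work_period); the table name elbat is taken from the query above.

```python
import sqlite3

# Sample data from the question, as (employee, work_period) pairs.
rows = [
    (1, "2019-01-01"), (1, "2019-01-02"), (1, "2019-01-03"),
    (1, "2019-01-04"), (1, "2019-01-05"), (1, "2019-01-10"),
    (1, "2019-01-11"), (1, "2019-01-12"), (2, "2019-01-20"),
    (2, "2019-01-22"), (2, "2019-01-23"), (2, "2019-01-24"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elbat (employee INTEGER, work_period TEXT)")
conn.executemany("INSERT INTO elbat VALUES (?, ?)", rows)

# c = 1 marks the first row of a run, g is the running total of c
# (the group number), and continuous_days counts rows within a group.
# Assumption: SQLite's date(x, '-1 day') replaces T-SQL's dateadd().
query = """
SELECT employee, work_period, c, g,
       sum(1) OVER (PARTITION BY employee, g
                    ORDER BY work_period) AS continuous_days
FROM (SELECT employee, work_period, c,
             sum(c) OVER (PARTITION BY employee
                          ORDER BY work_period) AS g
      FROM (SELECT employee, work_period,
                   CASE
                     WHEN lag(work_period) OVER (PARTITION BY employee
                                                 ORDER BY work_period)
                          = date(work_period, '-1 day') THEN 0
                     ELSE 1
                   END AS c
            FROM elbat) x) y
ORDER BY employee, work_period
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For employee 1, c is 1 on 2019-01-01 and 2019-01-10 and 0 elsewhere, so g is 1 for the first five rows and 2 for the last three, and the count restarts where g changes.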
Just another option... Very similar to Gaps-and-Islands, but without the final aggregation.
Example
Select Employee
,work_period
,Cont_Days = row_number() over (partition by Employee,Grp Order by Work_Period)
From (
Select *
,Grp = datediff(day,'1900-01-01',work_period) - row_number() over (partition by Employee Order by Work_Period)
From YourTable
) A
Returns
Employee work_period Cont_Days
1 2019-01-01 1
1 2019-01-02 2
1 2019-01-03 3
1 2019-01-04 4
1 2019-01-05 5
1 2019-01-10 1
1 2019-01-11 2
1 2019-01-12 3
2 2019-01-20 1
2 2019-01-22 1
2 2019-01-23 2
2 2019-01-24 3
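This is similar to John's answer, but a bit simpler.
You can identify groups of adjacent rows by subtracting a sequence of numbers from the dates: within a run of consecutive days, the difference is constant. So: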
select Employee, work_period,
       row_number() over (partition by employee, grp order by work_period) as day_counter
from (select t.*,
             dateadd(day,
                     - row_number() over (partition by employee order by work_period),
                     work_period
                    ) as grp
      from t
     ) t;
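To make the "constant difference" visible, here is a small sketch that prints the row number and the resulting grp value next to each date. It is an illustration under assumptions, not the exact query above: it runs on SQLite 3.25 or later through Python's sqlite3 module, computes the row number in an extra subquery (the alias rn is made up for this example), and uses date(work_period, '-N days') in place of dateadd().

```python
import sqlite3

# Sample data from the question, as (employee, work_period) pairs.
rows = [
    (1, "2019-01-01"), (1, "2019-01-02"), (1, "2019-01-03"),
    (1, "2019-01-04"), (1, "2019-01-05"), (1, "2019-01-10"),
    (1, "2019-01-11"), (1, "2019-01-12"), (2, "2019-01-20"),
    (2, "2019-01-22"), (2, "2019-01-23"), (2, "2019-01-24"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (employee INTEGER, work_period TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# rn numbers each employee's rows by date; grp = work_period - rn days.
# Consecutive dates advance in step with rn, so grp stays the same
# within a run and jumps whenever there is a gap.
query = """
SELECT employee, work_period, rn, grp,
       row_number() OVER (PARTITION BY employee, grp
                          ORDER BY work_period) AS day_counter
FROM (SELECT employee, work_period, rn,
             date(work_period, '-' || rn || ' days') AS grp
      FROM (SELECT employee, work_period,
                   row_number() OVER (PARTITION BY employee
                                      ORDER BY work_period) AS rn
            FROM t) a) b
ORDER BY employee, work_period
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The grp value itself has no meaning as a date; it only needs to be equal for rows in the same run and different across runs for the same employee.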
Another fun approach is to identify the rows where "islands" start and then use datediff():
select t.*,
       datediff(day,
                max(case when island_start_flag = 1 then work_period end) over (partition by employee order by work_period),
                work_period
               ) + 1 as days_counter
from (select t.*,
             (case when lag(work_period) over (partition by employee order by work_period) >= dateadd(day, -1, work_period)
                   then 0 else 1
              end) as island_start_flag
      from t
     ) t;
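If you want to try the island-start idea without a SQL Server instance, here is a rough sketch on SQLite 3.25 or later through Python's sqlite3 module. The substitutions are assumptions of this example: a difference of julianday() values replaces datediff(day, ...), date(work_period, '-1 day') replaces dateadd(), and the column is named work_period as in the sample data.

```python
import sqlite3

# Sample data from the question, as (employee, work_period) pairs.
rows = [
    (1, "2019-01-01"), (1, "2019-01-02"), (1, "2019-01-03"),
    (1, "2019-01-04"), (1, "2019-01-05"), (1, "2019-01-10"),
    (1, "2019-01-11"), (1, "2019-01-12"), (2, "2019-01-20"),
    (2, "2019-01-22"), (2, "2019-01-23"), (2, "2019-01-24"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (employee INTEGER, work_period TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Inner query: flag rows whose previous row is not the day before.
# Outer query: the running max() over flagged dates is the start of the
# current island; days since that start, plus 1, is the counter.
query = """
SELECT employee, work_period, island_start_flag,
       CAST(julianday(work_period)
            - julianday(max(CASE WHEN island_start_flag = 1
                                 THEN work_period END)
                        OVER (PARTITION BY employee ORDER BY work_period))
            AS INTEGER) + 1 AS days_counter
FROM (SELECT employee, work_period,
             CASE
               WHEN lag(work_period) OVER (PARTITION BY employee
                                           ORDER BY work_period)
                    >= date(work_period, '-1 day') THEN 0
               ELSE 1
             END AS island_start_flag
      FROM t) s
ORDER BY employee, work_period
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The running max() works here because ISO-formatted date strings sort in date order and each new island starts later than the previous one.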