Group islands of contiguous dates, including missing weekends

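I have a large dataset containing the dates of certain actions, and I am trying to count consecutive dates. Searching around, I found this: https://www.sqlservercentral.com/articles/group-islands-of-contiguous-dates-sql-spackle, which is almost perfect and does what I'm looking for. Unfortunately, my dataset comes with an unusual business rule that the query needs to enforce: if an employee's last date is a Friday and the next start date is the immediately following Monday, those dates should be grouped into the same "island" without adding the weekend days to the length. Here is what I mean, using a sample dataset: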

CREATE TABLE Actions    
   ([Employee] varchar(2), [ActionDate] date)    
;       

INSERT INTO Actions    
    ([Employee], [ActionDate])

VALUES    
    ('AA', '2019-01-03'),    
    ('AA', '2019-01-04'),    
    ('AA', '2019-01-07'),    
    ('AA', '2019-01-08'),    
    ('BB', '2019-08-01'),    
    ('BB', '2019-08-02'),    
    ('BB', '2019-08-03'),    
    ('BB', '2019-08-04'),    
    ('BB', '2019-08-05'),    
    ('BB', '2019-08-06'),    
    ('CC', '2019-09-09'),    
    ('CC', '2019-09-10'),    
    ('CC', '2019-09-11'),    
    ('CC', '2019-09-12'),    
    ('CC', '2019-09-13'),    
    ('CC', '2019-09-16'),    
    ('CC', '2019-09-17'),    
    ('CC', '2019-09-18')    
;

The query I found, with the columns changed to match the sample:

WITH    
days As    
(    
SELECT Employee,    
       ActionDate,    
       DATEADD(dd, -ROW_NUMBER() OVER  (PARTITION BY Employee ORDER BY Employee, ActionDate), ActionDate) As grouping    
FROM Actions    
GROUP BY Employee, ActionDate    
)    
SELECT Employee,    
       MIN(ActionDate) AS ActionStart,    
       MAX(ActionDate) As ActionEnd,    
       DATEDIFF(dd,MIN(ActionDate),MAX(ActionDate))+1 As ActLength    
FROM days    
GROUP BY Employee, grouping    
ORDER BY Employee, ActionStart

The result is:

+-------+----------+-------------+------------+-----------+
| RowNr | Employee | ActionStart | ActionEnd  | ActLength |
+-------+----------+-------------+------------+-----------+
|     1 | AA       |  03.01.2019 | 04.01.2019 |         2 |
|     2 | AA       |  07.01.2019 | 08.01.2019 |         2 |
|     3 | BB       |  01.08.2019 | 06.08.2019 |         6 |
|     4 | CC       |  09.09.2019 | 13.09.2019 |         5 |
|     5 | CC       |  16.09.2019 | 18.09.2019 |         3 |
+-------+----------+-------------+------------+-----------+

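In this example, employee AA has an end date of 04.01.2019, a Friday, and the next start date, 07.01.2019, is the following Monday. CC likewise has an end date of 13.09.2019, a Friday, and the next start date is the following Monday, 16.09.2019. The query should "merge" these ranges without adding the weekend days to ActLength. So the desired result is: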

+-------+----------+-------------+------------+-----------+
| RowNr | Employee | ActionStart | ActionEnd  | ActLength |
+-------+----------+-------------+------------+-----------+
|     1 | AA       |  03.01.2019 | 08.01.2019 |         4 |
|     2 | BB       |  01.08.2019 | 06.08.2019 |         6 |
|     3 | CC       |  09.09.2019 | 18.09.2019 |         8 |
+-------+----------+-------------+------------+-----------+

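Does anyone know how to build such a rule into this kind of SQL query? I tried looking around, but usually people want to exclude weekends. Thanks a lot, everyone.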

I find it easier to implement the logic you want using lag() and a window sum:

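select employee, min(actionDate) actionStart, max(actionDate) actionEnd, count(*) actionLength
from (
    select 
        a.*, sum(
            -- 0 = continues the current island, 1 = starts a new one
            case when actionDate = dateadd(day, 1, lagActionDate) 
                -- a 3-day gap ending on a Monday is a skipped weekend (Friday to Monday);
                -- datename() returns the day name in the session language
                or (actionDate = dateadd(day, 3, lagActionDate) and datename(weekday, actionDate) = 'Monday')
            then 0 else 1 end
        ) over(partition by employee order by actionDate) grp  -- running sum = island number
    from (
        select 
            a.*, 
            -- previous action date for the same employee (null on the first row)
            lag(actionDate) over(partition by employee order by actionDate) lagActionDate
        from actions a
    ) a
) a
group by employee, grp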

Demo on DB Fiddle:

employee | actionStart | actionEnd  | actionLength
:------- | :---------- | :--------- | -----------:
AA       | 2019-01-03  | 2019-01-08 |            4
BB       | 2019-08-01  | 2019-08-06 |            6
CC       | 2019-09-09  | 2019-09-18 |            8
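
To see why comparing each date with the previous one yields one group per island, the same rule can be traced outside the database. What follows is only a minimal Python sketch, not part of the query above: the function name `group_islands` and the `sample` list are made up for illustration, and it assumes each (employee, date) pair appears at most once, as in the sample data.

```python
from datetime import date, timedelta
from itertools import groupby


def group_islands(rows):
    """Group (employee, action_date) pairs into islands.

    Hypothetical helper mirroring the lag() comparison: a date continues
    the current island if it is the day after the previous date, or if it
    is a Monday exactly three days after the previous date (Friday to Monday).
    Returns (employee, start, end, length) tuples, where length counts
    the rows in the island, like count(*) in the query.
    """
    islands = []
    ordered = sorted(rows)  # by employee, then by date
    for employee, items in groupby(ordered, key=lambda r: r[0]):
        prev = None  # plays the role of lag(actionDate); None on the first row
        for _, day in items:
            continues = prev is not None and (
                day == prev + timedelta(days=1)
                or (day == prev + timedelta(days=3) and day.weekday() == 0)  # 0 = Monday
            )
            if continues:
                emp, start, _, length = islands[-1]
                islands[-1] = (emp, start, day, length + 1)
            else:
                islands.append((employee, day, day, 1))
            prev = day
    return islands


# Sample data from the question
sample = (
    [("AA", date(2019, 1, d)) for d in (3, 4, 7, 8)]
    + [("BB", date(2019, 8, d)) for d in range(1, 7)]
    + [("CC", date(2019, 9, d)) for d in (9, 10, 11, 12, 13, 16, 17, 18)]
)

result = group_islands(sample)
for employee, start, end, length in result:
    print(employee, start, end, length)
```

Because the length is a count of rows rather than a date difference, the skipped Saturday and Sunday never enter the total, which is what the business rule asks for.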