How to determine the consecutive date count/days in SQL Server: "finding islands" (consecutive rows)

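I have a table of students and I want to know how long each of their sessions/trainings lasted. I want to count consecutive days while excluding weekends. A class has a Start Date and an End Date; for example, student ID S1 can book a class in January and then book again in February, and I want to know how many days the January booking and the February booking each cover, excluding weekends. Basically, per student ID, I am looking for runs of consecutive dates from start date to end date with no breaks other than weekends.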

SELECT 
 [ID]
,[StartDate]
,[EndDate]
,[BookingDays] AS Consecutive_Booking
FROM StudentBooking

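The student classification (Type) works like this: if a student has booked a class for 5 days, or has made 2 bookings, within the last 3 months (start date to end date, Monday to Friday), they are a Resident; otherwise they are a Visitor. Start and end dates are only ever recorded Monday to Friday. Note that student ID 1 has consecutive dates that should be counted as a single block (02/01/2018-12/01/2018); the second block is 22/01-26/01.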

I would like to reproduce the table below.

ID   StartDate  EndDate     Duration     Type
1   02/01/2018  05/01/2018              ==> Please note: continuous dates
1   08/01/2018  12/01/2018   9           Resident
1   22/01/2018  26/01/2018   5           Resident 
2   23/01/2018  26/01/2018   4           Visitor
3   29/01/2018  31/01/2018   3           Visitor

Here is my solution to your problem.

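In the CTE "comparison", I join each record to itself and to all later records of the same student. This gives me a possible start of a consecutive training block (from the left side of the join) and a possible end of such a block (from the right side of the join). Using CROSS APPLY, I calculate 2 values:

  1. The workdays from the start of the first possible interval of the chain to the end of the last possible interval
  2. The workdays within the last possible interval of the chain.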

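From the latter value, I use a window function to build a running total of workdays over the intervals between the possible start and the possible end. You tagged the question with "SQL 2012", so this window function should be available.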
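The workday arithmetic behind both values can be tried on its own. The following is a minimal sketch, not part of the answer's query: the variables @StartDate and @EndDate and the aliases d, w, s and e are assumptions made for illustration. It relies on day 0 (1900-01-01) being a Monday, so that integer division by 7 counts the weekends that have passed.

```sql
-- Hypothetical sample dates; 'YYYYMMDD' literals avoid DATEFORMAT ambiguity.
DECLARE @StartDate datetime = '20180102';  -- a Tuesday
DECLARE @EndDate   datetime = '20180112';  -- Friday of the following week

SELECT w.Workdays
FROM (VALUES (
        DATEDIFF(day, 0, @StartDate),       -- s: day number of the first day (day 0 = Monday 1900-01-01)
        1 + DATEDIFF(day, 0, @EndDate))     -- e: day number after the last day (exclusive end)
     ) d (s, e)
  CROSS APPLY (VALUES (
        (d.e - d.s)                         -- all calendar days in [s, e)
      - (d.e / 7 - d.s / 7)                 -- minus the Sundays   (day number % 7 = 6)
      - ((d.e + 1) / 7 - (d.s + 1) / 7))    -- minus the Saturdays (day number % 7 = 5)
     ) w (Workdays);
```

For these two dates there are 11 calendar days, one Saturday and one Sunday, so the query returns 9, which matches the Duration expected for student 1's first block.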

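In the next CTE ("sorting"), I restrict the previous result to the rows where the running total equals the workdays between the first start date and the last end date. That way, only consecutive blocks remain. These are then numbered in two ways:

  1. Consecutive blocks with the same EndDate are numbered by StartDate in ascending order
  2. Consecutive blocks with the same StartDate are numbered by EndDate in descending order.

For each EndDate I want the earliest StartDate, and for that StartDate I want only the latest EndDate, so I filter both numberings to 1. Here it is: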

WITH
  comparison (ID, StartDate, EndDate, TotalDays, SumSingleDays) AS (
    SELECT bStart.ID, bStart.StartDate, bEnd.EndDate, Workdays.Total
      , SUM(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
    FROM StudentBookings bStart
      INNER JOIN StudentBookings bEnd 
        ON bStart.ID = bEnd.ID AND bStart.StartDate <= bEnd.StartDate
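      CROSS APPLY (VALUES (
        DATEDIFF(day, 0, bStart.StartDate),  -- s1: day number of the chain's first day
        DATEDIFF(day, 0, bEnd.StartDate),    -- s2: day number of the last interval's first day
        1+DATEDIFF(day, 0, bEnd.EndDate))    -- e2: day number after the last interval's last day
      ) d (s1, s2, e2)
      CROSS APPLY (VALUES (
        -- calendar days minus Sundays minus Saturdays, for s1..e2 and for s2..e2
        (d.e2 - d.s1) - (d.e2/7 - d.s1/7) - ((d.e2+1)/7 - (d.s1+1)/7),
        (d.e2 - d.s2) - (d.e2/7 - d.s2/7) - ((d.e2+1)/7 - (d.s2+1)/7))
      ) Workdays (Total, Single)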
  ),
  sorting (ID, StartDate, EndDate, Duration, RowNumStart, RowNumEnd) AS (
    SELECT ID, StartDate, EndDate, TotalDays
      , ROW_NUMBER() OVER (PARTITION BY ID, EndDate ORDER BY StartDate)
      , ROW_NUMBER() OVER (PARTITION BY ID, StartDate ORDER BY EndDate DESC)
    FROM comparison
    WHERE TotalDays = SumSingleDays
  )
SELECT ID, StartDate, EndDate, Duration
  , CASE WHEN Duration >= 5 THEN 'Resident' ELSE 'Visitor' END AS [Type]
FROM sorting 
WHERE (RowNumStart = 1) 
  AND (RowNumEnd = 1)
ORDER BY ID, StartDate;
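To see why the filter TotalDays = SumSingleDays leaves only consecutive blocks, it can help to look at the unfiltered rows of the "comparison" CTE. The sketch below is an illustration only: SampleBookings is a hypothetical stand-in for StudentBookings, filled with the rows from the question, and the ChainStatus column is added purely for display.

```sql
WITH
  SampleBookings (ID, StartDate, EndDate) AS (
    -- Hypothetical stand-in for StudentBookings, filled with the question's rows.
    SELECT v.ID, CAST(v.StartDate AS datetime), CAST(v.EndDate AS datetime)
    FROM (VALUES
      (1, '20180102', '20180105'),
      (1, '20180108', '20180112'),
      (1, '20180122', '20180126'),
      (2, '20180123', '20180126'),
      (3, '20180129', '20180131')
    ) v (ID, StartDate, EndDate)
  ),
  comparison (ID, StartDate, EndDate, TotalDays, SumSingleDays) AS (
    -- Same logic as the answer's first CTE, reading from the sample rows.
    SELECT bStart.ID, bStart.StartDate, bEnd.EndDate, Workdays.Total
      , SUM(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
    FROM SampleBookings bStart
      INNER JOIN SampleBookings bEnd
        ON bStart.ID = bEnd.ID AND bStart.StartDate <= bEnd.StartDate
      CROSS APPLY (VALUES (
        DATEDIFF(day, 0, bStart.StartDate),
        DATEDIFF(day, 0, bEnd.StartDate),
        1+DATEDIFF(day, 0, bEnd.EndDate))
      ) d (s1, s2, e2)
      CROSS APPLY (VALUES (
        (d.e2 - d.s1) - (d.e2/7 - d.s1/7) - ((d.e2+1)/7 - (d.s1+1)/7),
        (d.e2 - d.s2) - (d.e2/7 - d.s2/7) - ((d.e2+1)/7 - (d.s2+1)/7))
      ) Workdays (Total, Single)
  )
-- Show every candidate chain and whether the "sorting" filter would keep it.
SELECT ID, StartDate, EndDate, TotalDays, SumSingleDays
  , CASE WHEN TotalDays = SumSingleDays THEN 'kept' ELSE 'dropped' END AS ChainStatus
FROM comparison
ORDER BY ID, StartDate, EndDate;
```

For student 1, the candidate chain from 02/01/2018 to 12/01/2018 has 9 in both columns and is kept, whereas the chain from 02/01/2018 to 26/01/2018 has TotalDays 19 against SumSingleDays 14, because the week of 15/01 is not booked, and is dropped.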

Result:

There may be a more elegant way to solve this using Itzik Ben-Gan's interval packing solution; I will post it once I have worked it out.

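Added:

In addition, I count the bookings in each booking block and sum them per student (ID) to make the final "Resident" decision. The first CTE (comparison) is restricted to bookings from the last 3 months: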

WITH
  comparison (ID, StartDate, EndDate, TotalDays, CountBookings, SumSingleDays) AS (
    SELECT bStart.ID, bStart.StartDate, bEnd.EndDate, Workdays.Total
      , COUNT(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
      , SUM(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.StartDate 
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
    FROM StudentBookings bStart
      INNER JOIN StudentBookings bEnd 
        ON bStart.ID = bEnd.ID AND bStart.StartDate <= bEnd.StartDate
      CROSS APPLY (VALUES (
        DATEDIFF(day, 0, bStart.StartDate), 
        DATEDIFF(day, 0, bEnd.StartDate), 
        1+DATEDIFF(day, 0, bEnd.EndDate))
      ) d (s1, s2, e2)
      CROSS APPLY (VALUES (
        (d.e2 - d.s1) - (d.e2/7 - d.s1/7) - ((d.e2+1)/7 - (d.s1+1)/7),
        (d.e2 - d.s2) - (d.e2/7 - d.s2/7) - ((d.e2+1)/7 - (d.s2+1)/7))
      ) Workdays (Total, Single)
    WHERE bStart.StartDate >= DATEADD(month, -3, GETDATE())
  ),
  sorting (ID, StartDate, EndDate, Duration, CountBookings, RowNumStart, RowNumEnd) AS (
    SELECT ID, StartDate, EndDate, TotalDays, CountBookings
      , ROW_NUMBER() OVER (PARTITION BY ID, EndDate ORDER BY StartDate)
      , ROW_NUMBER() OVER (PARTITION BY ID, StartDate ORDER BY EndDate DESC)
    FROM comparison
    WHERE TotalDays = SumSingleDays
  ),
  counting (ID, StartDate, EndDate, Duration, Bookings) AS (
    SELECT ID, StartDate, EndDate, Duration
      , SUM(CountBookings) OVER (PARTITION BY ID)
    FROM sorting WHERE (RowNumStart = 1) AND (RowNumEnd = 1)
  )
SELECT ID, StartDate, EndDate, Duration, Bookings
  , CASE 
      WHEN Duration >= 5 OR Bookings >= 2 THEN 'Resident' ELSE 'Visitor'
    END AS [Type]
FROM counting
ORDER BY ID, StartDate;
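The per-student total in the "counting" CTE comes from a windowed SUM without an ORDER BY, which repeats the same total on every row of a student. A minimal sketch of that one step, using hypothetical block rows (the derived table b and its values are assumptions based on the question's sample data, not output of the query above):

```sql
-- Hypothetical rows: one row per consecutive block with its booking count.
SELECT b.ID, b.StartDate, b.EndDate, b.CountBookings
  , SUM(b.CountBookings) OVER (PARTITION BY b.ID) AS Bookings  -- total per student
FROM (VALUES
    (1, '20180102', '20180112', 2),  -- two bookings merged into one block
    (1, '20180122', '20180126', 1),
    (2, '20180123', '20180126', 1)
  ) b (ID, StartDate, EndDate, CountBookings)
ORDER BY b.ID, b.StartDate;
```

Both rows of student 1 get Bookings = 3 and the row of student 2 gets Bookings = 1, so with the rule Bookings >= 2 every block of student 1 would be labelled 'Resident' regardless of its own Duration.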

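Filtering on the class reference:

The ClassReference is taken from the bStart table reference and filtered there. To be able to add this field to the final query, it must also be used in the join to the bEnd table reference, so that only booking intervals with the same ClassReference value are joined into a block: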

WITH
  comparison (ID, ClassReference, StartDate, EndDate, TotalDays, CountBookings, SumSingleDays) AS (
    SELECT bStart.ID, bStart.ClassReference, bStart.StartDate, bEnd.EndDate, Workdays.Total
      , COUNT(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.ClassReference, bStart.StartDate
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
      , SUM(Workdays.Single) OVER (
          PARTITION BY bStart.ID, bStart.ClassReference, bStart.StartDate
          ORDER BY bEnd.StartDate
          ROWS UNBOUNDED PRECEDING)
    FROM StudentBookings bStart
      INNER JOIN StudentBookings bEnd 
        ON bStart.ID = bEnd.ID AND bStart.StartDate <= bEnd.StartDate
       AND bStart.ClassReference = bEnd.ClassReference
      CROSS APPLY (VALUES (
        DATEDIFF(day, 0, bStart.StartDate), 
        DATEDIFF(day, 0, bEnd.StartDate), 
        1+DATEDIFF(day, 0, bEnd.EndDate))
      ) d (s1, s2, e2)
      CROSS APPLY (VALUES (
        (d.e2 - d.s1) - (d.e2/7 - d.s1/7) - ((d.e2+1)/7 - (d.s1+1)/7),
        (d.e2 - d.s2) - (d.e2/7 - d.s2/7) - ((d.e2+1)/7 - (d.s2+1)/7))
      ) Workdays (Total, Single)
    WHERE bStart.StartDate >= DATEADD(month, -3, GETDATE())
      AND bStart.ClassReference IN (N'C1', N'C2')
  ),
  sorting (ID, ClassReference, StartDate, EndDate, Duration, CountBookings, RowNumStart, RowNumEnd) AS (
    SELECT ID, ClassReference, StartDate, EndDate, TotalDays, CountBookings
      , ROW_NUMBER() OVER (PARTITION BY ID, ClassReference, EndDate ORDER BY StartDate)
      , ROW_NUMBER() OVER (PARTITION BY ID, ClassReference, StartDate ORDER BY EndDate DESC)
    FROM comparison
    WHERE TotalDays = SumSingleDays
  ),
  counting (ID, ClassReference, StartDate, EndDate, Duration, Bookings) AS (
    SELECT ID, ClassReference, StartDate, EndDate, Duration
      , SUM(CountBookings) OVER (PARTITION BY ID, ClassReference)
    FROM sorting WHERE (RowNumStart = 1) AND (RowNumEnd = 1)
  )
SELECT ID, ClassReference, StartDate, EndDate, Duration, Bookings
  , CASE 
      WHEN Duration >= 5 OR Bookings >= 2 THEN 'Resident' ELSE 'Visitor'
    END AS [Type]
FROM counting
ORDER BY ID, StartDate;

Tested with this data:

With the filter set to the last 12 months, the query returns:

So student 1 is a "Resident" in class C2 but a "Visitor" in class C1.