Finding Gaps in Timestamps for Multiple Users in PostgreSQL

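I am working with a dataset that contains check-in and check-out times for multiple office rooms over the past 5 years. One of the projects I was asked to work on was calculating the amount of time each room is busy and vacant over various time ranges (daily, weekly, monthly, etc.) assuming normal operational hours (8am to 5pm). A two-day sample of the dataset looks like this: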

room_id         start_dt                end_dt
Room: Exam 3    2019-05-04 09:00:00     2019-05-04 11:30:00
Room: Exam 3    2019-05-04 11:30:00     2019-05-04 12:15:00
Room: Exam 3    2019-05-04 12:30:00     2019-05-04 13:00:00
Room: Exam 3    2019-05-05 09:00:00     2019-05-05 13:00:00
Room: Exam 4    2019-05-04 08:00:00     2019-05-04 09:00:00
Room: Exam 4    2019-05-04 09:00:00     2019-05-04 11:00:00
Room: Exam 4    2019-05-04 14:00:00     2019-05-04 16:00:00
Room: Exam 4    2019-05-05 08:30:00     2019-05-05 09:30:00

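I borrowed and modified some code written by @Branko Dimitrijevic in an earlier Stack Overflow post (full post: SQL Query to show gaps between multiple date ranges) to try to handle multiple rooms. Below is my modified code, with two instances of room_id in the SELECT clause for debugging purposes: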

SELECT t1.room_id, t2.room_id, end_dt, start_dt, start_dt - end_dt as gap_dur
FROM
    (
        SELECT DISTINCT room_id, start_dt, ROW_NUMBER() OVER (ORDER BY start_dt) RN
        FROM my_table T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM my_table T2
                WHERE (T1.start_dt > T2.start_dt and t1.room_id = t2.room_id)
                    AND (T1.start_dt < T2.end_dt and t1.room_id = t2.room_id)
            )
        ) T1
    JOIN (
        SELECT DISTINCT room_id, end_dt, ROW_NUMBER() OVER (ORDER BY end_dt) RN
        FROM my_table T1
        WHERE
            NOT EXISTS (
                SELECT *
                FROM my_table T2
                WHERE (T1.end_dt > T2.start_dt and t1.room_id = t2.room_id)
                    AND (T1.end_dt < T2.end_dt and t1.room_id = t2.room_id)
            )
    ) T2
    ON T1.RN - 1 = T2.RN
WHERE
    end_dt < start_dt

This is the output I receive:

room_id         room_id         end_dt                  start_dt                gap_dur
Room: Exam 4    Room: Exam 4    2019-05-04 16:00:00     2019-05-05 08:30:00     16:30:00
Room: Exam 4    Room: Exam 3    2019-05-04 13:00:00     2019-05-04 14:00:00     01:00:00
Room: Exam 3    Room: Exam 3    2019-05-04 12:15:00     2019-05-04 12:30:00     00:15:00

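However, this gets mixed up between different rooms, and I don't know how to apply the working-day bounds, such as finding the gap between 8am and the first scheduled event. Below is the ideal output, or at least a data format that I could use to compute the statistics I need with a few simple GROUP BY queries: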

room_id         end_dt                  start_dt                gap_dur
Room: Exam 3    2019-05-04 08:00:00     2019-05-04 09:00:00     01:00:00
Room: Exam 3    2019-05-04 12:15:00     2019-05-04 12:30:00     00:15:00
Room: Exam 3    2019-05-04 13:00:00     2019-05-04 17:00:00     04:00:00
Room: Exam 3    2019-05-05 08:00:00     2019-05-05 09:00:00     01:00:00
Room: Exam 3    2019-05-05 13:00:00     2019-05-05 17:00:00     04:00:00
Room: Exam 4    2019-05-04 11:00:00     2019-05-04 14:00:00     03:00:00
Room: Exam 4    2019-05-04 16:00:00     2019-05-04 17:00:00     01:00:00
Room: Exam 4    2019-05-05 08:00:00     2019-05-05 08:30:00     00:30:00
Room: Exam 4    2019-05-05 09:30:00     2019-05-05 17:00:00     07:30:00

Any help would be greatly appreciated, and I'm happy to provide more information if that would help!

One of the projects I was asked to work on was calculating the amount of time each room is busy and vacant over various time ranges (daily, weekly, monthly, etc.) assuming normal operational hours (8am to 5pm).

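Based on your sample data, two assumptions seem reasonable:

  • The "busy" periods do not overlap.
  • The "busy" periods each fall within a single day.

If these are not true, I suggest you ask a new question with an appropriate explanation and sample data.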
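
Both assumptions can be checked against the data before relying on them. The following is a minimal sketch, assuming the bookings live in a table my_table(room_id, start_dt, end_dt) as in the question; the aliases a, b, other_start and other_end are only illustrative. If both queries return no rows, the assumptions hold for the stored data.

```sql
-- Check 1: bookings in the same room that overlap in time.
-- Two ranges overlap when each one starts before the other ends.
select a.room_id, a.start_dt, a.end_dt,
       b.start_dt as other_start, b.end_dt as other_end
from my_table a
join my_table b
  on a.room_id = b.room_id
 and a.start_dt < b.end_dt
 and b.start_dt < a.end_dt
 -- report each pair once and skip a row paired with itself
 -- (note: exact duplicate rows are therefore not reported here)
 and (a.start_dt, a.end_dt) < (b.start_dt, b.end_dt);

-- Check 2: bookings that start and end on different calendar days.
select room_id, start_dt, end_dt
from my_table
where date_trunc('day', start_dt) <> date_trunc('day', end_dt);
```

Back-to-back bookings such as 09:00-11:30 followed by 11:30-12:15 are not reported by the first query, because the comparisons are strict.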

With those assumptions, the calculation for a given day is fairly simple:

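select r.room_id,
       date_trunc('day', r.start_dt) as booking_day,
       -- busy time: each booking clipped to the 08:00-17:00 window
       sum( least(extract(epoch from r.end_dt), v.epoch2) -
            greatest(extract(epoch from r.start_dt), v.epoch1)
          ) as busy_seconds,
       -- free time: the 9-hour window (constant within a group) minus the busy time
       (max(v.epoch2 - v.epoch1) -
        sum( least(extract(epoch from r.end_dt), v.epoch2) -
             greatest(extract(epoch from r.start_dt), v.epoch1)
           )
       ) as free_seconds
-- "rooms" is the bookings table (my_table in the question);
-- lateral is required so that the values list can refer to r.start_dt
from rooms r cross join lateral
     (values (extract(epoch from date_trunc('day', r.start_dt) + interval '8 hour'),
              extract(epoch from date_trunc('day', r.start_dt) + interval '17 hour')
             )
     ) v(epoch1, epoch2)
group by r.room_id, date_trunc('day', r.start_dt)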
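
If the individual gaps are needed rather than daily totals, one possible approach is to compare each booking with the previous one in the same room on the same day. The sketch below is an illustration under the same two assumptions, not a tested production query: it assumes the table my_table(room_id, start_dt, end_dt) from the question, and the names booked, booking_day and prev_end are made up for the example. Partitioning the window by room and day is what keeps rows from different rooms from being paired with each other.

```sql
with booked as (
    select room_id,
           date_trunc('day', start_dt) as booking_day,
           start_dt,
           end_dt,
           -- end of the previous booking in the same room on the same day;
           -- for the first booking of the day, fall back to 08:00
           coalesce(lag(end_dt) over w,
                    date_trunc('day', start_dt) + interval '8 hours') as prev_end
    from my_table
    window w as (partition by room_id, date_trunc('day', start_dt)
                 order by start_dt)
)
-- gaps before each booking (including 08:00 to the first booking)
select room_id,
       prev_end as end_dt,
       start_dt,
       start_dt - prev_end as gap_dur
from booked
where start_dt > prev_end
union all
-- gap between the last booking of the day and 17:00
select room_id,
       max(end_dt),
       booking_day + interval '17 hours',
       booking_day + interval '17 hours' - max(end_dt)
from booked
group by room_id, booking_day
having max(end_dt) < booking_day + interval '17 hours'
order by room_id, end_dt;
```

Each output row uses the (room_id, end_dt, start_dt, gap_dur) layout from the question, so summing gap_dur grouped by room_id and date_trunc('week', end_dt) or date_trunc('month', end_dt) gives free time for longer ranges. Two limitations of this sketch: a room with no bookings on a given day produces no rows for that day, and bookings that start before 08:00 or end after 17:00 are not clipped to the working window.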