Postgres - calculate total working hours based on IN and OUT entries

I have the following tables:

1) My company table

 id |   c_name   |  c_code  | status 
----+------------+----------+--------
  1 | AAAAAAAAAA |  AA1234  | Active 


2) My users table

 id |    c_id    | u_name   | status | emp_id 
----+------------+----------+--------+--------
  1 |      1     | XXXXXXXX | Active |    1   
  2 |      1     | YYYYYYYY | Active |    2   


3) My attendance table

 id |  u_id  |        swipe_time      | status 
----+--------+------------------------+--------
  1 |   1    |  2020-08-20 16:00:00   | IN     
  2 |   1    |  2020-08-20 20:00:00   | OUT    
  3 |   1    |  2020-08-20 21:00:00   | IN     
  4 |   1    |  2020-08-21 01:00:00   | OUT    
  5 |   1    |  2020-08-21 16:00:00   | IN     
  6 |   1    |  2020-08-21 19:00:00   | OUT    
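
For anyone who wants to reproduce this, the sample data above can be recreated roughly as follows. This is only a sketch: the table name company and all column types are assumptions (the question does not state them); users and attendance are the names used in the queries further down.

```sql
-- Assumed schema: column types are guesses based on the sample rows.
create table company (
  id     integer primary key,
  c_name text,
  c_code text,
  status text
);

create table users (
  id     integer primary key,
  c_id   integer references company (id),
  u_name text,
  status text,
  emp_id integer
);

create table attendance (
  id         integer primary key,
  u_id       integer references users (id),
  swipe_time timestamp,
  status     text  -- 'IN' or 'OUT'
);

insert into company values (1, 'AAAAAAAAAA', 'AA1234', 'Active');

insert into users values
  (1, 1, 'XXXXXXXX', 'Active', 1),
  (2, 1, 'YYYYYYYY', 'Active', 2);

insert into attendance values
  (1, 1, '2020-08-20 16:00:00', 'IN'),
  (2, 1, '2020-08-20 20:00:00', 'OUT'),
  (3, 1, '2020-08-20 21:00:00', 'IN'),
  (4, 1, '2020-08-21 01:00:00', 'OUT'),
  (5, 1, '2020-08-21 16:00:00', 'IN'),
  (6, 1, '2020-08-21 19:00:00', 'OUT');
```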


I need to calculate attendance grouped by date and u_id, like this:
Note: the query parameters are "from date", "to date" and "company id".

u_id |   u_name  |     date    |        in_time       |        out_time      | hrs 
-----+-----------+-------------+----------------------+----------------------+-----
 1   |  XXXXXXXX | 2020-08-20  |  2020-08-20 16:00:00 |  2020-08-21 01:00:00 |  7  
 1   |  XXXXXXXX | 2020-08-21  |  2020-08-21 16:00:00 |  2020-08-21 19:00:00 |  4  
 2   |  YYYYYYYY |     null    |        null          |        null          |  0  


Is this possible in PostgreSQL?

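Using the lead window function makes this simpler and easier to read. It works correctly for balanced IN and OUT attendance events; otherwise the hours come out as NULL. That makes sense, because the person has either not left yet, has not shown up, or the attendance data is corrupt. Note that lead simply takes the user's next swipe whatever its status, so a NULL only appears when an IN has no later swipe at all.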

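select 
 u.id u_id, u.u_name,
 t.date_in date, t.t_in in_time, t.t_out out_time,
 -- epoch / 3600 gives the full length in hours, including fractions
 extract(epoch from t.t_out - t.t_in) / 3600 hrs
from users u
left outer join 
(
  select u_id,
  swipe_time::date date_in,
  swipe_time t_in, 
  -- the next swipe of the same user is taken as the matching OUT
  lead(swipe_time, 1) over (partition by u_id order by swipe_time) t_out,
  status
  from attendance
) t 
-- filter on status here, not in WHERE, so users without attendance are kept
on u.id = t.u_id and t.status = 'IN';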

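The tricky part is expanding a row that covers two (calendar) days into two rows and assigning the hours of the "next day" correctly.

The first part is to get a pivot table that combines each IN/OUT pair into a single row.

A simple (but not very efficient) approach is: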

  select ain.u_id, 
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
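
If the correlated subquery turns out to be too slow, one possible alternative is to pair the rows with window functions, so that attendance is read only once. This is a sketch, not a drop-in replacement: it assumes swipes normally alternate between IN and OUT per user. An IN that is not directly followed by an OUT gets a NULL time_out here, whereas the subquery above skips ahead to the next OUT.

```sql
select u_id,
       swipe_time as time_in,
       -- accept the next swipe as time_out only if it really is an OUT
       case when next_status = 'OUT' then next_time end as time_out
from (
  select u_id,
         status,
         swipe_time,
         lead(swipe_time) over w as next_time,
         lead(status)     over w as next_status
  from attendance
  window w as (partition by u_id order by swipe_time)
) s
where status = 'IN';
```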

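The next step is to break up the rows that span more than one day into two rows.

This assumes that an IN/OUT pair never spans more than two days!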

with inout as (
  select ain.u_id, 
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
), expanded as (
  select u_id, 
         time_in::date as "date", 
         time_in,
         time_out
  from inout     
  where time_in::date = time_out::date  
  union all
  select i.u_id, 
         x.time_in::date as date, 
         x.time_in,
         x.time_out
  from inout i   
    cross join lateral (
       select i.u_id, 
              i.time_in, 
              i.time_in::date + 1 as time_out
       union all
       select i.u_id, 
              i.time_out::date, 
              i.time_out
    ) x
  where i.time_out::date > i.time_in::date  
)
select *
from expanded;

The above returns the following for your sample data:

u_id | date       | time_in             | time_out           
-----+------------+---------------------+--------------------
   1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-20 20:00:00
   1 | 2020-08-20 | 2020-08-20 21:00:00 | 2020-08-21 00:00:00
   1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 01:00:00
   1 | 2020-08-21 | 2020-08-21 16:00:00 | 2020-08-21 19:00:00

How does this work?

First we select all the rows that start and end on the same day with this part:

  select u_id, 
         time_in::date as "date", 
         time_in,
         time_out
  from inout     
  where time_in::date = time_out::date  

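The second part of the union splits up the rows that span two days by using a cross join that generates one row from the original start time to midnight, and another row from midnight to the original end time: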

  select i.u_id, 
         x.time_in::date as date, 
         x.time_in,
         x.time_out
  from inout i   
    cross join lateral (
       -- this generates a row for the first of the two days
       select i.u_id, 
              i.time_in, 
              i.time_in::date + 1 as time_out
       union all
       -- this generates the row for the next day
       select i.u_id, 
              i.time_out::date, 
              i.time_out
    ) x
  where i.time_out::date > i.time_in::date 
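
If an IN/OUT pair could span more than two days, one way to generalize the split is to generate one row per calendar day with generate_series and clip each row to that day. This is a sketch under the assumption that swipe_time is a timestamp column; the alias g(day_start) is just a name chosen for illustration. It would take the place of the two-branch union.

```sql
with inout as (
  select ain.u_id,
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
)
select i.u_id,
       g.day_start::date as "date",
       -- clip the pair to the current calendar day
       greatest(i.time_in, g.day_start) as time_in,
       least(i.time_out, g.day_start + interval '1 day') as time_out
from inout i
  -- one row per calendar day touched by the IN/OUT pair
  cross join lateral generate_series(
       date_trunc('day', i.time_in),
       date_trunc('day', i.time_out),
       interval '1 day') as g(day_start)
-- drop zero-length rows, e.g. when time_out falls exactly on midnight
where greatest(i.time_in, g.day_start)
    < least(i.time_out, g.day_start + interval '1 day')
order by i.u_id, g.day_start;
```

As with the union version, an IN without a matching OUT produces no row, because generate_series returns nothing when its end argument is NULL.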

Finally, the new "expanded" rows are aggregated by grouping them by user and date, and left joined to the users table to get the user name as well.

with inout as (
  select ain.u_id, 
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
), expanded as (
  select u_id, 
         time_in::date as "date", 
         time_in,
         time_out
  from inout     
  where time_in::date = time_out::date  
  union all
  select i.u_id, 
         x.time_in::date as date, 
         x.time_in,
         x.time_out
  from inout i   
    cross join lateral (
       select i.u_id, 
              i.time_in, 
              i.time_in::date + 1 as time_out
       union all
       select i.u_id, 
              i.time_out::date, 
              i.time_out
    ) x
  where i.time_out::date > i.time_in::date  
)
select u.id,
       u.u_name,
       e."date", 
       min(e.time_in) as time_in,
       max(e.time_out) as time_out,
       sum(e.time_out - e.time_in) as duration
from users u
  left join expanded e on u.id = e.u_id
group by u.id, u.u_name, e."date"
order by u.id, e."date";

The result then is:

u_id | date       | time_in             | time_out            | duration                                     
-----+------------+---------------------+---------------------+----------------------------------------------
   1 | 2020-08-20 | 2020-08-20 16:00:00 | 2020-08-21 00:00:00 | 0 years 0 mons 0 days 7 hours 0 mins 0.0 secs
   1 | 2020-08-21 | 2020-08-21 00:00:00 | 2020-08-21 19:00:00 | 0 years 0 mons 0 days 4 hours 0 mins 0.0 secs

The "duration" column is an interval, which you can format to your liking.
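
As a sketch of how this could be taken one step further, the interval can be converted to a number of hours with extract(epoch from ...) / 3600, and the three parameters from the question can be applied in the final select. The literal dates and the company id 1 below are placeholders for those parameters. The date filter is placed in the join condition rather than in WHERE so that users without any attendance in the range are still returned.

```sql
with inout as (
  select ain.u_id,
         ain.swipe_time as time_in,
         (select min(aout.swipe_time)
          from attendance aout
          where aout.u_id = ain.u_id
            and aout.status = 'OUT'
            and aout.swipe_time > ain.swipe_time) as time_out
  from attendance ain
  where ain.status = 'IN'
), expanded as (
  select u_id,
         time_in::date as "date",
         time_in,
         time_out
  from inout
  where time_in::date = time_out::date
  union all
  select i.u_id,
         x.time_in::date as "date",
         x.time_in,
         x.time_out
  from inout i
    cross join lateral (
       select i.u_id,
              i.time_in,
              i.time_in::date + 1 as time_out
       union all
       select i.u_id,
              i.time_out::date,
              i.time_out
    ) x
  where i.time_out::date > i.time_in::date
)
select u.id as u_id,
       u.u_name,
       e."date",
       min(e.time_in) as in_time,
       max(e.time_out) as out_time,
       -- interval -> seconds -> hours; 0 for users without attendance
       coalesce(extract(epoch from sum(e.time_out - e.time_in)) / 3600, 0) as hrs
from users u
  left join expanded e
         on e.u_id = u.id
        and e."date" between date '2020-08-20' and date '2020-08-21'  -- from date / to date (placeholders)
where u.c_id = 1  -- company id (placeholder)
group by u.id, u.u_name, e."date"
order by u.id, e."date";
```

For the sample data this should give 7 and 4 hours for user 1 and one row with NULL dates and 0 hours for user 2.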
