Calculating total active time from on/off activity log
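Suppose I have a support system. When agents are available to handle tickets, they switch themselves on. When they are no longer available, they switch themselves off. Simple enough. Every time someone switches themselves on or off, the action is stored in a database table, like this: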
USERID ACTION CREATED
1 ON 2016-01-10 12:00
2 ON 2016-01-10 13:00
2 OFF 2016-01-10 15:00
1 OFF 2016-01-10 17:00
1 ON 2016-01-11 10:00
1 OFF 2016-01-11 11:00
In the example above, user 1 was active for 6 hours in total and user 2 for 2 hours. How can I write a query that gives me this data, like so:
USERID TOTAL
1 6 hours
2 2 hours
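The query also needs to handle the case where a user has switched on and is still active (only the ON action is recorded, with no corresponding OFF).
Honestly, I don't even know where to start. I was thinking I might need to generate a series of timestamps and then do something with that. Or maybe this is just one of those cases that is easier to handle in application code. Either way, any help is appreciated.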
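For comparison with the "easier in code" option, here is a minimal Python sketch of the pairing logic, assuming the rows have already been fetched as (userid, action, created) tuples. The names total_active_time, ts and now are illustrative and not part of the original question, and ignoring a repeated ON is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def total_active_time(rows, now):
    """Sum ON -> OFF durations per user; a session still open runs until `now`."""
    started = {}                     # userid -> start of the currently open session
    totals = defaultdict(timedelta)  # userid -> accumulated active time
    for userid, action, created in sorted(rows, key=lambda r: r[2]):
        if action == "ON":
            # Assumption: a repeated ON while already on is ignored.
            started.setdefault(userid, created)
        elif action == "OFF" and userid in started:
            totals[userid] += created - started.pop(userid)
    # Any session without a closing OFF is counted up to `now`.
    for userid, start in started.items():
        totals[userid] += now - start
    return dict(totals)


def ts(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


rows = [
    (1, "ON", ts("2016-01-10 12:00")),
    (2, "ON", ts("2016-01-10 13:00")),
    (2, "OFF", ts("2016-01-10 15:00")),
    (1, "OFF", ts("2016-01-10 17:00")),
    (1, "ON", ts("2016-01-11 10:00")),
    (1, "OFF", ts("2016-01-11 11:00")),
]
result = total_active_time(rows, now=ts("2016-01-12 00:00"))
for userid, total in sorted(result.items()):
    print(userid, total)
```

For the sample log this gives 6 hours for user 1 and 2 hours for user 2. Doing the same work in SQL avoids pulling every row into the application.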
Setup:
create table a_table (userid int, action text, created timestamp);
insert into a_table values
(1, 'ON', '2016-01-10 12:00'),
(2, 'ON', '2016-01-10 13:00'),
(2, 'OFF', '2016-01-10 15:00'),
(1, 'OFF', '2016-01-10 17:00'),
(1, 'ON', '2016-01-11 10:00'),
(1, 'OFF', '2016-01-11 11:00'),
(1, 'ON', '2016-01-11 20:00');-- added
Idea:
select distinct on (a1.userid, a1.created)
a1.userid, a1.action, a1.created,
a2.userid, a2.action, a2.created,
coalesce(a2.created, '2016-01-11 24:00:00')- a1.created as total -- '2016-01-11 24:00:00' = the end of reported period
from a_table a1
left join a_table a2
on a1.userid = a2.userid and a1.action = 'ON' and a2.action = 'OFF' and a1.created < a2.created
where a1.action = 'ON'
order by a1.userid, a1.created, a2.created;
userid | action | created | userid | action | created | total
--------+--------+---------------------+--------+--------+---------------------+----------
1 | ON | 2016-01-10 12:00:00 | 1 | OFF | 2016-01-10 17:00:00 | 05:00:00
1 | ON | 2016-01-11 10:00:00 | 1 | OFF | 2016-01-11 11:00:00 | 01:00:00
1 | ON | 2016-01-11 20:00:00 | | | | 04:00:00
2 | ON | 2016-01-10 13:00:00 | 2 | OFF | 2016-01-10 15:00:00 | 02:00:00
(4 rows)
Query:
select userid, sum(total)
from (
select distinct on (a1.userid, a1.created)
a1.userid,
coalesce(a2.created, '2016-01-11 24:00:00')- a1.created as total
from a_table a1
left join a_table a2
on a1.userid = a2.userid and a1.action = 'ON' and a2.action = 'OFF' and a1.created < a2.created
where a1.action = 'ON'
order by a1.userid, a1.created, a2.created
) s
group by 1
order by 1;
userid | sum
--------+----------
1 | 10:00:00
2 | 02:00:00
(2 rows)
- I added another row for user_id = 3 that has an ON but no OFF.
- nextAction isn't really needed, but it helps to see which row LEAD picks up.
Base query:
SELECT "USERID", "ACTION", "CREATED",
LEAD("CREATED") OVER (PARTITION BY "USERID" ORDER BY "CREATED") nextDate,
LEAD("ACTION") OVER (PARTITION BY "USERID" ORDER BY "CREATED") nextAction
FROM activity
Output:
| USERID | ACTION | CREATED | nextDate | nextAction |
|--------|--------|---------------------------|---------------------------|------------|
| 1 | ON | January, 10 2016 12:00:00 | January, 10 2016 17:00:00 | OFF |
| 1 | OFF | January, 10 2016 17:00:00 | January, 11 2016 10:00:00 | ON |
| 1 | ON | January, 11 2016 10:00:00 | January, 11 2016 11:00:00 | OFF |
| 1 | OFF | January, 11 2016 11:00:00 | (null) | (null) |
| 2 | ON | January, 10 2016 13:00:00 | January, 10 2016 15:00:00 | OFF |
| 2 | OFF | January, 10 2016 15:00:00 | (null) | (null) |
| 3 | ON | July, 08 2016 05:00:00 | (null) | (null) |
Final query (COALESCE(nextDate, NOW()) covers an ON that has no OFF):
SELECT "USERID",
SUM (COALESCE(nextDate, NOW()) - "CREATED") onTime
FROM (
SELECT "USERID", "ACTION", "CREATED",
LEAD("CREATED") OVER (PARTITION BY "USERID" ORDER BY "CREATED") nextDate,
LEAD("ACTION") OVER (PARTITION BY "USERID" ORDER BY "CREATED") nextaction
FROM activity
) T
WHERE "ACTION" = 'ON'
GROUP BY "USERID"
Output:
| USERID | ontime |
|--------|-------------------------------------------------------|
| 1 | 0 years 0 mons 0 days 6 hours 0 mins 0.00 secs |
| 2 | 0 years 0 mons 0 days 2 hours 0 mins 0.00 secs |
| 3 | 0 years 0 mons 0 days 10 hours 47 mins 52.920889 secs |
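To make the LEAD approach reproducible without a PostgreSQL server, the following sketch runs an analogous query in SQLite through Python's sqlite3 module. It is an adaptation, not the answer's exact query: timestamps are stored as text, durations are computed in whole seconds with strftime('%s', ...), and a fixed :now parameter stands in for NOW(). It assumes the bundled SQLite is 3.25 or newer, since older versions lack window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table activity (userid int, action text, created text)")
conn.executemany(
    "insert into activity values (?, ?, ?)",
    [
        (1, "ON", "2016-01-10 12:00"),
        (2, "ON", "2016-01-10 13:00"),
        (2, "OFF", "2016-01-10 15:00"),
        (1, "OFF", "2016-01-10 17:00"),
        (1, "ON", "2016-01-11 10:00"),
        (1, "OFF", "2016-01-11 11:00"),
        (3, "ON", "2016-07-08 05:00"),  # still on: no OFF recorded
    ],
)

query = """
select userid,
       sum(cast(strftime('%s', coalesce(next_created, :now)) as integer)
         - cast(strftime('%s', created) as integer)) as on_seconds
from (
    select userid, action, created,
           lead(created) over (partition by userid order by created) as next_created
    from activity
) t
where action = 'ON'
group by userid
order by userid
"""
# :now is a fixed stand-in for NOW() so the result is repeatable.
on_seconds = dict(conn.execute(query, {"now": "2016-07-08 15:00"}).fetchall())
for userid, seconds in on_seconds.items():
    print(userid, seconds / 3600, "hours")
```

With :now set to 2016-07-08 15:00, user 3's open session counts as 10 hours. Note that LEAD pairs each ON with the user's next row whatever its action, so this relies on ON and OFF strictly alternating.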
You can use a query like this:
SELECT USERID,
CASE WHEN COUNT(*) = 1 THEN NOW() ELSE MAX(CREATED) END - MIN(CREATED) onTime
FROM (
SELECT USERID, ACTION, CREATED,
COUNT(CASE WHEN ACTION = 'ON' THEN 1 END) OVER
(PARTITION BY USERID ORDER BY CREATED) AS grp
FROM mytable) AS t
GROUP BY USERID, grp
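As written, the query above returns one row per ON/OFF session (one per USERID, grp pair) rather than one total per user. A possible way to roll the sessions up is sketched below, again run in SQLite through Python's sqlite3 so it can be tried locally. The outer SUM, the text timestamps, the seconds arithmetic and the fixed :now parameter are assumptions added for this sketch, and SQLite 3.25 or newer is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (userid int, action text, created text)")
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [
        (1, "ON", "2016-01-10 12:00"),
        (2, "ON", "2016-01-10 13:00"),
        (2, "OFF", "2016-01-10 15:00"),
        (1, "OFF", "2016-01-10 17:00"),
        (1, "ON", "2016-01-11 10:00"),
        (1, "OFF", "2016-01-11 11:00"),
        (1, "ON", "2016-01-11 20:00"),  # open session, no OFF yet
    ],
)

query = """
select userid, sum(session_seconds) as total_seconds
from (
    -- one row per session: grp increases by one at every ON
    select userid, grp,
           cast(strftime('%s', case when count(*) = 1 then :now
                                    else max(created) end) as integer)
         - cast(strftime('%s', min(created)) as integer) as session_seconds
    from (
        select userid, action, created,
               count(case when action = 'ON' then 1 end)
                   over (partition by userid order by created) as grp
        from mytable
    ) t
    group by userid, grp
) sessions
group by userid
order by userid
"""
totals = dict(conn.execute(query, {"now": "2016-01-12 00:00"}).fetchall())
for userid, seconds in totals.items():
    print(userid, seconds / 3600, "hours")
```

Taking the end of 2016-01-11 as the current time, user 1 totals 10 hours (5 + 1 + 4 for the open session) and user 2 totals 2 hours.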
Test table/data:
drop table if exists a_table;
create table a_table (seq serial primary key, userid int, action text, created timestamp);
create index a_table_i on a_table (userid, action, created);
insert into a_table (userid, action, created) values
(1, 'ON' , '2016-01-10 12:00'),
(1, 'OFF' , '2016-01-10 17:00'),
(1, 'ON' , '2016-01-11 10:00'),
(1, 'OFF' , '2016-01-11 11:00'),
(1, 'ON' , '2016-01-11 20:00'),
(1, 'ON' , '2016-01-11 21:00'), -- an "ON" without an "OFF"
(1, 'OFF' , '2016-01-11 21:00'),
(1, 'OFF' , '2016-01-11 22:00'), -- an "OFF" without an "ON"
(1, 'ON' , '2016-01-11 21:10'),
(1, 'ON' , '2016-01-11 21:20'), -- an "ON" without an "OFF"
(1, 'OFF' , '2016-01-11 21:30'),
(2, 'ON' , '2016-01-10 13:00'),
(2, 'OFF' , '2016-01-10 15:00'),
(2, 'ON' , '2016-01-11 13:00'),
(2, 'OFF' , '2016-01-12 15:00');
Query the table, keeping in mind that a user can have many "ON" actions without any "OFF" (and vice versa):
SELECT USERID, SUM(FINISHED - CREATED) AS ACTIVE_TIME
FROM (
SELECT *,
-- GET THE TIMESTAMP OF THE USER'S NEXT RECORD (BY SEQ) RIGHT AFTER IT
COALESCE((SELECT CREATED FROM a_table T1 WHERE T1.USERID = T.USERID AND T1.SEQ > T.SEQ ORDER BY T1.SEQ LIMIT 1),CURRENT_TIMESTAMP) AS FINISHED
FROM a_table T
-- LIST ONLY RECORDS WHERE ACTION IS 'ON' AND THE USER'S ACTION RIGHT BEFORE WAS 'OFF' OR 'NONE'
WHERE ACTION = 'ON'
AND COALESCE((SELECT ACTION FROM a_table T1 WHERE T1.USERID = T.USERID AND T1.SEQ < T.SEQ ORDER BY T1.SEQ DESC LIMIT 1),'OFF') = 'OFF'
) A
GROUP BY USERID
Result:
userid active_time
1 07:10:00
2 1 day 04:00:00
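To see where 07:10:00 comes from, the rule the query applies can be spelled out in a few lines of Python. This is a sketch that mirrors the SQL as written, not a separate method: rows are taken in insertion (seq) order, an ON counts only if the user's previous record was an OFF or did not exist, and the session ends at the user's next record, whatever its action. The function name active_time_by_seq and the now parameter are illustrative.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def active_time_by_seq(rows, now):
    """rows: (userid, action, created) tuples in insertion (seq) order."""
    per_user = defaultdict(list)
    for userid, action, created in rows:
        per_user[userid].append((action, created))

    totals = {}
    for userid, events in per_user.items():
        total = timedelta()
        for i, (action, created) in enumerate(events):
            previous = events[i - 1][0] if i > 0 else "OFF"  # no history counts as OFF
            if action == "ON" and previous == "OFF":
                # The session ends at the user's next record, or at `now` if none.
                finished = events[i + 1][1] if i + 1 < len(events) else now
                total += finished - created
        totals[userid] = total
    return totals


def ts(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


rows = [
    (1, "ON", ts("2016-01-10 12:00")), (1, "OFF", ts("2016-01-10 17:00")),
    (1, "ON", ts("2016-01-11 10:00")), (1, "OFF", ts("2016-01-11 11:00")),
    (1, "ON", ts("2016-01-11 20:00")), (1, "ON", ts("2016-01-11 21:00")),
    (1, "OFF", ts("2016-01-11 21:00")), (1, "OFF", ts("2016-01-11 22:00")),
    (1, "ON", ts("2016-01-11 21:10")), (1, "ON", ts("2016-01-11 21:20")),
    (1, "OFF", ts("2016-01-11 21:30")),
    (2, "ON", ts("2016-01-10 13:00")), (2, "OFF", ts("2016-01-10 15:00")),
    (2, "ON", ts("2016-01-11 13:00")), (2, "OFF", ts("2016-01-12 15:00")),
]
totals = active_time_by_seq(rows, now=ts("2016-01-13 00:00"))
for userid, total in sorted(totals.items()):
    print(userid, total)
```

For user 1 the counted sessions are 12:00 to 17:00, 10:00 to 11:00, 20:00 to 21:00 and 21:10 to 21:20, which add up to 7 hours 10 minutes. The last two end at a repeated ON rather than at an OFF, because the next record by seq is used regardless of its action.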