How to count events in groups before a specific event
Suppose I have a table like the following:
\d events
       Table "public.events"
   Column   |           Type           | Modifiers
------------+--------------------------+-----------
 my_id      | bigint                   |
 tstamp     | timestamp with time zone |
 event_type | text                     |
Sample data:
   my_id    |           tstamp           | event_type
------------+----------------------------+------------
 1111111111 | 2015-11-14 09:02:46.185+02 | A
 1111111111 | 2015-11-14 17:32:58+02     | B
 1111111111 | 2015-11-28 15:06:30.895+02 | A
 1111111111 | 2015-12-05 15:22:31.582+02 | A
 2222222222 | 2015-11-17 15:06:07.481+02 | A
 2222222222 | 2015-11-17 20:30:03+02     | B
 2222222222 | 2015-12-04 15:36:31.532+02 | A
 3333333333 | 2015-11-20 15:06:01.621+02 | A
 3333333333 | 2015-11-20 19:15:09.908+02 | A
 3333333333 | 2015-11-21 15:06:01.621+02 | A
 3333333333 | 2015-11-26 09:07:45.134+02 | B
 3333333333 | 2015-11-27 14:39:31.657+02 | A
 4444444444 | 2015-12-05 10:21:21.441+02 | A
 4444444444 | 2015-12-05 20:00:40.772+02 | B
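To make this easy to reproduce, here is a sketch of a setup script that should recreate the table and the sample rows. The column definitions are copied from the \d output above. Constraints and indexes are left out, since none are shown there.

```sql
-- Recreate the table as described by \d events (no constraints assumed)
CREATE TABLE events (
    my_id      bigint,
    tstamp     timestamp with time zone,
    event_type text
);

-- Sample rows, copied from the listing above
INSERT INTO events (my_id, tstamp, event_type) VALUES
    (1111111111, '2015-11-14 09:02:46.185+02', 'A'),
    (1111111111, '2015-11-14 17:32:58+02',     'B'),
    (1111111111, '2015-11-28 15:06:30.895+02', 'A'),
    (1111111111, '2015-12-05 15:22:31.582+02', 'A'),
    (2222222222, '2015-11-17 15:06:07.481+02', 'A'),
    (2222222222, '2015-11-17 20:30:03+02',     'B'),
    (2222222222, '2015-12-04 15:36:31.532+02', 'A'),
    (3333333333, '2015-11-20 15:06:01.621+02', 'A'),
    (3333333333, '2015-11-20 19:15:09.908+02', 'A'),
    (3333333333, '2015-11-21 15:06:01.621+02', 'A'),
    (3333333333, '2015-11-26 09:07:45.134+02', 'B'),
    (3333333333, '2015-11-27 14:39:31.657+02', 'A'),
    (4444444444, '2015-12-05 10:21:21.441+02', 'A'),
    (4444444444, '2015-12-05 20:00:40.772+02', 'B');
```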
I'd like to count, for each my_id, all A events that occur before event B.
The expected output is:
   my_id    | events_before_B
------------+-----------------
 1111111111 |               1
 2222222222 |               1
 3333333333 |               3
 4444444444 |               1
I'm on Postgres version 9.4.
Do a self LEFT JOIN with GROUP BY:
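-- e2 holds the timestamp of the first B event for each my_id;
-- count(e2.my_id) counts only the e1 rows that precede that timestamp
select e1.my_id, count(e2.my_id) as events_before_b
from events e1
  left join (select my_id, min(tstamp) as b_tstamp
             from events
             where event_type = 'B'
             group by my_id) e2
    on e1.my_id = e2.my_id and e1.tstamp < e2.b_tstamp
group by e1.my_id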
Or, the NOT EXISTS version:
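select my_id, count(*) as events_before_b
from events e1
where e1.event_type = 'A'   -- without this filter the first B row itself is counted too
  and not exists (select 1 from events e2
                  where e2.my_id = e1.my_id
                    and e2.event_type = 'B'
                    and e2.tstamp < e1.tstamp)
group by my_id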
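If other event types can show up, or if a my_id without any B should still be listed with a count of 0, a variant of the LEFT JOIN query using the aggregate FILTER clause (available from Postgres 9.4 on) might be worth a try. This is only a sketch: the aliases e, b and events_before_b are names I picked, and it assumes "before B" means before the first B of each my_id.

```sql
-- b holds the first B timestamp per my_id (NULL when an id has no B at all)
SELECT e.my_id,
       count(*) FILTER (WHERE e.event_type = 'A'
                          AND e.tstamp < b.b_tstamp) AS events_before_b
FROM events AS e
LEFT JOIN (
    SELECT my_id, min(tstamp) AS b_tstamp
    FROM events
    WHERE event_type = 'B'
    GROUP BY my_id
) AS b ON b.my_id = e.my_id
GROUP BY e.my_id
ORDER BY e.my_id;
```

Because the timestamp comparison sits in the FILTER rather than in the join condition, every my_id keeps its row in the result. When b_tstamp is NULL the comparison is not true, so such ids simply get 0.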
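Another option is string_agg combined with position:
SELECT my_id
      -- ORDER BY tstamp makes the concatenation follow event time;
      -- without it the order of the aggregated letters is not guaranteed
      ,position('B' IN string_agg(event_type, '' ORDER BY tstamp)) - 1 AS events_before_B
FROM events
GROUP BY my_id
ORDER BY my_id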
Explanation:
select my_id, string_agg(event_type, '' order by tstamp) from events group by my_id
yields
my_id       string_agg
----------  ----------
1111111111  ABAA
2222222222  ABA
3333333333  AAABA
4444444444  AB
select position('B' in 'AAABAA')
can be used to find the position of B in the string AAABAA:
position
--------
4
So select position('B' in 'AAABAA') - 1
yields the position of the A just before B, which equals the number of A events preceding the first B:
position
--------
3
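One caveat: position returns 0 when the string contains no B, so a my_id without any B event would come out as -1. The approach also relies on every event_type being a single character and on A being the only other type. A possible guard, sketched here with NULLIF so that such ids yield NULL instead of -1 (the subquery alias s and the column name seq are my own):

```sql
-- Build the ordered event string once, then look for the first B in it
SELECT my_id,
       -- NULLIF turns "B not found" (0) into NULL, and NULL - 1 stays NULL
       NULLIF(position('B' IN seq), 0) - 1 AS events_before_b
FROM (
    SELECT my_id,
           string_agg(event_type, '' ORDER BY tstamp) AS seq
    FROM events
    GROUP BY my_id
) AS s
ORDER BY my_id;
```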