Grouping of PostgreSQL data
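I have a PostgreSQL table with events logged with date/time. The table contains the columns id, event, and timestamp.
My output has to be something like this:
'Day', '1st Timers', '2nd Timers', '3rd Timers', '3+ Timers'
1st timers are all ids that have done the event for the first time.
2nd timers are all ids that have done the event for the second time, and so on.
Is this possible with a single SQL query?
Edit: sample data and output, as requested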
user_id date event
1 09/03/15 14:08 opened
2 10/03/15 14:08 opened
1 11/03/15 14:08 opened
4 14/03/15 14:08 opened
1 15/03/15 14:08 opened
5 16/03/15 14:08 opened
1 17/03/15 14:08 opened
4 17/03/15 14:08 opened
6 18/03/15 14:08 opened
1 18/03/15 14:08 opened
6 18/03/15 14:08 other
Output (for event=opened)
date 1time 2times 3times 4times 5times
09/03/15 1 0 0 0 0
10/03/15 1 0 0 0 0
11/03/15 0 1 0 0 0
14/03/15 1 0 0 0 0
15/03/15 0 0 1 0 0
16/03/15 1 0 0 0 0
17/03/15 0 1 0 1 0
18/03/15 1 0 0 0 1
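For each date, you seem to want to count the number of users who hit 1 time, 2 times, and so on. I see this as a row_number() followed by conditional aggregation: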
select thedate,
       sum(case when seqnum = 1 then 1 else 0 end) as time_1,
       sum(case when seqnum = 2 then 1 else 0 end) as time_2,
       sum(case when seqnum = 3 then 1 else 0 end) as time_3,
       sum(case when seqnum = 4 then 1 else 0 end) as time_4,
       sum(case when seqnum = 5 then 1 else 0 end) as time_5
from (select t.*, date_trunc('day', date) as thedate,
             row_number() over (partition by user_id order by date_trunc('day', date)) as seqnum
      from tbl t  -- tbl is a placeholder name; "table" is a reserved word
      where event = 'opened'
     ) t
group by thedate
order by thedate;
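To see what the row_number() step produces before the aggregation, here is a minimal sketch that loads the question's sample rows into a temporary table and lists the sequence number per row. The table name tbl, the ISO-formatted timestamps, and reading the two-digit year as 2015 are assumptions for illustration; the column names follow the sample data.

```sql
-- Hypothetical scratch table mirroring the question's sample data.
CREATE TEMP TABLE tbl (
    user_id int,
    date    timestamp,
    event   text
);

INSERT INTO tbl (user_id, date, event) VALUES
    (1, '2015-03-09 14:08', 'opened'),
    (2, '2015-03-10 14:08', 'opened'),
    (1, '2015-03-11 14:08', 'opened'),
    (4, '2015-03-14 14:08', 'opened'),
    (1, '2015-03-15 14:08', 'opened'),
    (5, '2015-03-16 14:08', 'opened'),
    (1, '2015-03-17 14:08', 'opened'),
    (4, '2015-03-17 14:08', 'opened'),
    (6, '2015-03-18 14:08', 'opened'),
    (1, '2015-03-18 14:08', 'opened'),
    (6, '2015-03-18 14:08', 'other');

-- Inner step only: number each user's 'opened' events in time order.
SELECT user_id,
       date_trunc('day', date) AS thedate,
       row_number() OVER (PARTITION BY user_id ORDER BY date) AS seqnum
FROM   tbl
WHERE  event = 'opened'
ORDER  BY thedate, user_id;
```

With this data, user_id 1 has five 'opened' rows and so receives seqnum 1 through 5, which is what places a 1 in the 5times column for 18/03/15 in the expected output.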
Aggregate FILTER
In Postgres 9.4 or later, use the new aggregate FILTER clause:
SELECT event_time::date
     , count(*) FILTER (WHERE rn = 1) AS times_1
     , count(*) FILTER (WHERE rn = 2) AS times_2
     , count(*) FILTER (WHERE rn = 3) AS times_3
     -- etc.
FROM  (
   SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
   FROM   tbl
   ) t
GROUP  BY 1
ORDER  BY 1;
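The question also asks for a '3+ Timers' column and shows output for event = 'opened' only. A possible variant of the FILTER query covering both is sketched below. It assumes that "3+" means the fourth occurrence or later, and it reuses the placeholder names tbl and event_time; the output column names are made up for illustration.

```sql
SELECT event_time::date AS day
     , count(*) FILTER (WHERE rn = 1) AS timers_1st
     , count(*) FILTER (WHERE rn = 2) AS timers_2nd
     , count(*) FILTER (WHERE rn = 3) AS timers_3rd
     , count(*) FILTER (WHERE rn > 3) AS timers_3plus  -- assumption: "3+" = 4th time or later
FROM  (
   SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
   FROM   tbl
   WHERE  event = 'opened'  -- restrict to one event type before numbering
   ) t
GROUP  BY 1
ORDER  BY 1;
```

The event filter sits in the subquery so that rows for other events do not consume sequence numbers.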
Related:
- How can I simplify this game statistics query?
About the cast event_time::date:
- How to get the date and time from timestamp in PostgreSQL select query?
Crosstab
Or use an actual crosstab query (faster). Works in any modern Postgres version. Read this first:
- PostgreSQL Crosstab Query
SELECT * FROM crosstab(
   'SELECT event_time::date, rn, count(*)::int AS ct
    FROM  (
       SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
       FROM   tbl
       ) t
    GROUP  BY 1, 2
    ORDER  BY 1'
 , $$SELECT * FROM unnest ('{1,2,3}'::int[])$$
   ) AS ct (day date, times_1 int, times_2 int, times_3 int);
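Two practical details may help when trying the crosstab variant: crosstab() is provided by the additional module tablefunc, which has to be installed once per database, and the two-parameter form returns NULL (not 0) for a day that has no row in a given category. The sketch below, using the same placeholder names tbl and event_time, installs the extension, restricts the source query to event = 'opened', and wraps the counts in COALESCE. Dollar-quoting with $q$ is used so the inner string literal needs no escaping.

```sql
-- One-time setup per database; requires sufficient privileges.
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT day
     , COALESCE(times_1, 0) AS times_1  -- crosstab leaves missing categories NULL
     , COALESCE(times_2, 0) AS times_2
     , COALESCE(times_3, 0) AS times_3
FROM   crosstab(
   $q$SELECT event_time::date, rn, count(*)::int AS ct
      FROM  (
         SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
         FROM   tbl
         WHERE  event = 'opened'
         ) t
      GROUP  BY 1, 2
      ORDER  BY 1$q$
 , $$SELECT * FROM unnest('{1,2,3}'::int[])$$
   ) AS ct (day date, times_1 int, times_2 int, times_3 int);
```

Rows whose rn is not in the category list (here 4 and above) are ignored by this form of crosstab(), so extend the list and the column definition list together if more columns are needed.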