Grouping of PostgreSQL data

I have a PostgreSQL table with events recorded by date/time. The table contains the columns id, event, timestamp.

My output must look like this:

'Day', '1st Timers', '2nd Timers', '3rd Timers', '3+ Timers'

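1st timers are all ids that have done the event for the first time. 2nd timers are all ids that have done the event for the second time, and so on.

Is this possible with a single SQL query?

Edit: sample data and output, as requested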

user_id  date            event
1        09/03/15 14:08  opened
2        10/03/15 14:08  opened
1        11/03/15 14:08  opened
4        14/03/15 14:08  opened
1        15/03/15 14:08  opened
5        16/03/15 14:08  opened
1        17/03/15 14:08  opened
4        17/03/15 14:08  opened
6        18/03/15 14:08  opened
1        18/03/15 14:08  opened
6        18/03/15 14:08  other


Output (for event=opened)
date        1time   2times  3times  4times  5times
09/03/15    1       0       0       0       0
10/03/15    1       0       0       0       0
11/03/15    0       1       0       0       0
14/03/15    1       0       0       0       0
15/03/15    0       0       1       0       0
16/03/15    1       0       0       0       0
17/03/15    0       1       0       1       0
18/03/15    1       0       0       0       1

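For each date, you seem to want to count the number of users who hit the event for the 1st time, the 2nd time, and so on. I see this as row_number() followed by conditional aggregation: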

select thedate,
       sum(case when seqnum = 1 then 1 else 0 end) as time_1,
       sum(case when seqnum = 2 then 1 else 0 end) as time_2,
       sum(case when seqnum = 3 then 1 else 0 end) as time_3,
       sum(case when seqnum = 4 then 1 else 0 end) as time_4,
       sum(case when seqnum = 5 then 1 else 0 end) as time_5
from (select t.*, date_trunc('day', date) as thedate,
             row_number() over (partition by user_id order by date_trunc('day', date)) as seqnum
      from tbl t  -- tbl stands in for the actual table name (table is a reserved word)
      where event = 'opened'
     ) t
group by thedate
order by thedate;
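
To check this logic against the sample data from the question, the two steps (number each user's 'opened' events in time order, then count per day how many events carry each sequence number) can be replayed outside the database. The following is only a minimal Python sketch of that idea, not part of the query itself; the names rows and count_timers are made up for illustration, and the dates are assumed to be in dd/mm/yy format.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (user_id, timestamp, event).
rows = [
    (1, "09/03/15 14:08", "opened"),
    (2, "10/03/15 14:08", "opened"),
    (1, "11/03/15 14:08", "opened"),
    (4, "14/03/15 14:08", "opened"),
    (1, "15/03/15 14:08", "opened"),
    (5, "16/03/15 14:08", "opened"),
    (1, "17/03/15 14:08", "opened"),
    (4, "17/03/15 14:08", "opened"),
    (6, "18/03/15 14:08", "opened"),
    (1, "18/03/15 14:08", "opened"),
    (6, "18/03/15 14:08", "other"),
]


def count_timers(rows, event="opened", max_n=5):
    """Return {day: [count of 1st-time, 2nd-time, ..., max_n-th-time events]}."""
    # Keep only the requested event and sort by timestamp
    # (the WHERE clause and the ORDER BY of the window).
    parsed = sorted(
        (datetime.strptime(ts, "%d/%m/%y %H:%M"), user_id)
        for user_id, ts, ev in rows
        if ev == event
    )
    seen = defaultdict(int)  # running counter per user, like row_number() per partition
    result = {}
    for ts, user_id in parsed:
        seen[user_id] += 1
        counts = result.setdefault(ts.date(), [0] * max_n)
        if seen[user_id] <= max_n:
            # Conditional aggregation: one bucket per sequence number.
            counts[seen[user_id] - 1] += 1
    return result


table = count_timers(rows)
for day, counts in sorted(table.items()):
    print(day.strftime("%d/%m/%y"), *counts)
```

The printed rows can be compared line by line with the expected output in the question.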

Aggregate FILTER

Since Postgres 9.4, use the new aggregate FILTER clause:

SELECT event_time::date
     , count(*) FILTER (WHERE rn = 1) AS times_1
     , count(*) FILTER (WHERE rn = 2) AS times_2
     , count(*) FILTER (WHERE rn = 3) AS times_3
    -- etc.
FROM  (
   SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
   FROM   tbl
   ) t
GROUP  BY 1
ORDER  BY 1;

Related:

  • How can I simplify this game statistics query?

About the cast event_time::date:

  • How to get the date and time from timestamp in PostgreSQL select query?

Crosstab

Or use an actual crosstab query (faster). It works in any modern Postgres version. Read this first:

  • PostgreSQL Crosstab Query

SELECT * FROM crosstab(
       'SELECT event_time::date, rn, count(*)::int AS ct
        FROM  (
           SELECT *, row_number() OVER (PARTITION BY user_id ORDER BY event_time) AS rn
           FROM   tbl
           ) t
        GROUP  BY 1, 2
        ORDER  BY 1'

      ,$$SELECT * FROM unnest ('{1,2,3}'::int[])$$
   ) AS ct (day date, times_1 int, times_2 int, times_3 int);
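
The second argument of crosstab() lists the categories that become output columns. In the two-parameter form provided by the tablefunc module, a category that is absent for a given day yields NULL rather than 0, and source rows whose category is not in the list (here rn above 3) are ignored. The Python sketch below mimics only that pivoting behavior on made-up (day, rn, ct) triples. The helper pivot is hypothetical and says nothing about how tablefunc is implemented.

```python
def pivot(triples, categories):
    """Pivot (row_name, category, value) triples into one row per row_name.

    Missing categories stay None (NULL in SQL); categories that are not
    listed are dropped, as in the two-parameter form of crosstab().
    """
    position = {cat: i for i, cat in enumerate(categories)}
    out = {}
    for row_name, cat, value in triples:
        row = out.setdefault(row_name, [None] * len(categories))
        if cat in position:
            row[position[cat]] = value
    return out


# Made-up aggregated input, shaped like the inner query's (day, rn, ct) output.
triples = [
    ("2015-03-09", 1, 1),
    ("2015-03-17", 2, 1),
    ("2015-03-17", 4, 1),  # rn = 4 is not in the category list below
]
pivoted = pivot(triples, categories=[1, 2, 3])
for day, cells in pivoted.items():
    print(day, cells)
```

If zeros are preferred over NULL in the final result, the outer SELECT can wrap each column, for example COALESCE(times_1, 0).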