How do I use windowing functions in SQL to persist a record?

I have a dataset, and I'm trying to create a "session id" based on the timestamp at which a certain event occurs (in my case, a load).

My data:

userid  event  timestamp
xyz     load   '2016-12-01 08:21:13:000'
xyz     view   '2016-12-01 08:21:14:000'
xyz     view   '2016-12-01 08:21:16:000'
xyz     exit   '2016-12-01 08:21:17:000'
xyz     load   '2016-12-02 08:01:13:000'
xyz     view   '2016-12-02 08:01:16:000'
abc     load   '2016-12-01 08:11:13:000'
abc     view   '2016-12-01 08:11:14:000'

What I want is a new column called session_start_timestamp that tags each row with the timestamp of that user's most recent "load" event.

I know how to do this by building a subset table (taking the minimum timestamp and self-joining), but is there a lag/lead/max/partition function that can do it instead?

The final output should look like this:

userid  event  timestamp                  session_start_timestamp
xyz     load   '2016-12-01 08:21:13:000'  '2016-12-01 08:21:13:000'
xyz     view   '2016-12-01 08:21:14:000'  '2016-12-01 08:21:13:000'
xyz     view   '2016-12-01 08:21:16:000'  '2016-12-01 08:21:13:000'
xyz     exit   '2016-12-01 08:21:17:000'  '2016-12-01 08:21:13:000'
xyz     load   '2016-12-02 08:01:13:000'  '2016-12-02 08:01:13:000'
xyz     view   '2016-12-02 08:01:16:000'  '2016-12-02 08:01:13:000'
abc     load   '2016-12-01 08:11:13:000'  '2016-12-01 08:11:13:000'
abc     view   '2016-12-01 08:11:14:000'  '2016-12-01 08:11:13:000'

This is a gaps-and-islands problem:

SQL DEMO (postgresql)

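  1. Flag the gaps or break points (here, every "load" row).
  2. Then use a cumulative SUM() over those flags to number the groups.
  3. Then select the MIN() timestamp of each group.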

--

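WITH gap as (
    -- Step 1: flag each 'load' row as the start of a new group
    SELECT *, CASE WHEN "event" = 'load' THEN 1 ELSE 0 END as gap
    FROM Table1
), island as (
    -- Step 2: the running total of flags numbers the groups per user
    SELECT *, SUM(gap) OVER (PARTITION BY "userid" ORDER BY "timestamp") as grp
    FROM gap
)
-- Step 3: take the earliest timestamp within each user's group
SELECT *, MIN("timestamp") OVER (PARTITION BY "userid", "grp") as new_timestamp
FROM island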
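
To check the three steps locally, the query above can be run against the sample rows from the question. This is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later), and the run_demo helper, the explicit column list and the final ORDER BY are illustrative additions, not part of the original demo.

```python
import sqlite3

# Sample rows copied from the question. Timestamps are stored as text, which
# sorts correctly here because every value uses the same fixed-width format.
ROWS = [
    ("xyz", "load", "2016-12-01 08:21:13:000"),
    ("xyz", "view", "2016-12-01 08:21:14:000"),
    ("xyz", "view", "2016-12-01 08:21:16:000"),
    ("xyz", "exit", "2016-12-01 08:21:17:000"),
    ("xyz", "load", "2016-12-02 08:01:13:000"),
    ("xyz", "view", "2016-12-02 08:01:16:000"),
    ("abc", "load", "2016-12-01 08:11:13:000"),
    ("abc", "view", "2016-12-01 08:11:14:000"),
]

# Same three steps as the answer: flag, running SUM, MIN per group.
QUERY = """
WITH gap AS (
    SELECT *, CASE WHEN "event" = 'load' THEN 1 ELSE 0 END AS gap
    FROM Table1
), island AS (
    SELECT *, SUM(gap) OVER (PARTITION BY "userid" ORDER BY "timestamp") AS grp
    FROM gap
)
SELECT "userid", "event", "timestamp", gap, grp,
       MIN("timestamp") OVER (PARTITION BY "userid", grp) AS new_timestamp
FROM island
ORDER BY "userid", "timestamp"
"""


def run_demo(rows=ROWS):
    """Load the rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE Table1 ("userid" TEXT, "event" TEXT, "timestamp" TEXT)'
    )
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

With the sample data, grp is 1 for both abc rows and for the first four xyz rows, and 2 for the last two xyz rows, so new_timestamp matches the session_start_timestamp column requested in the question.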

You can merge the first two queries:

WITH island as (
    -- Flag 'load' rows and take the running total in a single step
    SELECT *, SUM(CASE WHEN "event" = 'load' THEN 1 ELSE 0 END)
              OVER (PARTITION BY "userid" ORDER BY "timestamp") as grp
    FROM Table1
)
-- The earliest timestamp within each user's group
SELECT *, MIN("timestamp") OVER (PARTITION BY "userid", "grp") as new_timestamp
FROM island
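
Since the question asks specifically about a max/partition approach, one possible alternative (not part of the answer above) is a single running MAX() over only the "load" timestamps. The sketch below again assumes Python's sqlite3 module (SQLite 3.25 or later) in place of PostgreSQL; the session_starts helper and the ALT_QUERY name are hypothetical. It relies on timestamps increasing within each user, so the largest "load" timestamp seen so far is also the most recent one. One difference from the MIN() version: a row that precedes a user's first "load" gets NULL here instead of the earliest timestamp of its group.

```python
import sqlite3

# Sample rows copied from the question (timestamps kept as fixed-width text).
ROWS = [
    ("xyz", "load", "2016-12-01 08:21:13:000"),
    ("xyz", "view", "2016-12-01 08:21:14:000"),
    ("xyz", "view", "2016-12-01 08:21:16:000"),
    ("xyz", "exit", "2016-12-01 08:21:17:000"),
    ("xyz", "load", "2016-12-02 08:01:13:000"),
    ("xyz", "view", "2016-12-02 08:01:16:000"),
    ("abc", "load", "2016-12-01 08:11:13:000"),
    ("abc", "view", "2016-12-01 08:11:14:000"),
]

# The CASE yields NULL for non-load rows and MAX() ignores NULLs, so the
# running MAX (default frame: partition start up to the current row) is the
# latest 'load' timestamp at or before each row.
ALT_QUERY = """
SELECT "userid", "event", "timestamp",
       MAX(CASE WHEN "event" = 'load' THEN "timestamp" END)
           OVER (PARTITION BY "userid" ORDER BY "timestamp")
           AS session_start_timestamp
FROM Table1
ORDER BY "userid", "timestamp"
"""


def session_starts(rows):
    """Return (userid, event, timestamp, session_start_timestamp) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE Table1 ("userid" TEXT, "event" TEXT, "timestamp" TEXT)'
    )
    conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
    result = conn.execute(ALT_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in session_starts(ROWS):
        print(row)
```

The SELECT statement itself uses only standard window-function syntax, so it should carry over to PostgreSQL unchanged, though it was only exercised here against SQLite.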