Count sessions per minute derived from start and end timespans

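I have a table containing records of user activity, each covering a span indicated by a start and an end time. I am looking for the number of users active in the system per unit of time for the previous day.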

Sessions are at most one hour long and do not cross hour boundaries. A session can end and a new one can begin within the same minute.

Here is a simplified version of the query:

with minutes AS (
    -- ignore this...it generates a day's worth of timestamps for each minute
    -- it's hairy but is what I'm stuck with on redshift
    select (dateadd(minute, -row_number() over (order by true), sysdate::date)) as minute
        from seed_table limit 1440
),
sessions as (
    select sid, ts_start, ts_end
    from user_sessions s
    where ts_end >= sysdate::date-'1 day'::interval 
        and ts_start < sysdate::date
)
select m.minute, count(distinct(s.sid))
from minutes m
left join sessions s on s.ts_end >= m.minute and s.ts_start < m.minute+'1 min'::interval
group by 1

I am trying to avoid that nasty left join:

->  XN Nested Loop Left Join DS_BCAST_INNER  (cost=6913826151.95..4727012848741.55 rows=410434560 width=166)
    Join Filter: (("inner".ts_start < ("outer"."minute" + '00:01:00'::interval)) AND ("inner".ts_end >= "outer"."minute"))

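Based on Gordon Linoff's answer, here is what almost works for me. It undercounts when a user's sessions hand off from one to the next within the same minute, but it seems to be the right direction. The original query could overcount for the same reason, but taking a distinct count of session IDs within each minute solved that.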

select minute, sum(count) over (order by minute rows unbounded preceding) as users
from (
    select minute, sum(count) as count
    from (
        (
            select date_trunc('minute', ts_start) as minute, count(*) as count
            from sessions
            group by 1
        ) union all (
            select date_trunc('minute', ts_end) as minute, - count(*) as count
            from sessions
            group by 1
        )
    ) s1
    group by minute
) s2
order by minute;
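To make the undercount concrete, here is a small Python model of both approaches (a sketch, not the Redshift query itself). The three sessions are made-up data, and it assumes `sid` identifies a session, so a distinct count of `sid` equals a count of sessions. With the decrement applied in the end minute, a session stops counting in the minute it ends, and a session that starts and ends inside one minute is never counted. Applying the decrement one minute later reproduces the counts of the original overlap join on this data.

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import accumulate

MINUTE = timedelta(minutes=1)

def trunc(ts):
    """Python stand-in for date_trunc('minute', ts)."""
    return ts.replace(second=0, microsecond=0)

def overlap_counts(sessions, minutes):
    """Model of the original join: a session counts in every minute it touches."""
    return [sum(1 for s, e in sessions if e >= m and s < m + MINUTE)
            for m in minutes]

def running_counts(sessions, minutes, end_shift=timedelta(0)):
    """+1 in each start minute, -1 in each (optionally shifted) end minute,
    then a running sum over the minutes."""
    delta = Counter()
    for s, e in sessions:
        delta[trunc(s)] += 1
        delta[trunc(e) + end_shift] -= 1
    return list(accumulate(delta[m] for m in minutes))

# Hypothetical data: B starts in the minute A ends; C lives inside one minute.
day = datetime(2024, 1, 1)
sessions = [
    (day.replace(hour=10, minute=0, second=10), day.replace(hour=10, minute=1, second=20)),  # A
    (day.replace(hour=10, minute=1, second=40), day.replace(hour=10, minute=2, second=30)),  # B
    (day.replace(hour=10, minute=2, second=5), day.replace(hour=10, minute=2, second=50)),   # C
]
minutes = [day.replace(hour=10, minute=i) for i in range(4)]

original = overlap_counts(sessions, minutes)          # [1, 2, 2, 0]
unshifted = running_counts(sessions, minutes)         # [1, 1, 0, 0]
shifted = running_counts(sessions, minutes, MINUTE)   # [1, 2, 2, 0]
```

In SQL terms, the analogous change would be to group the second branch by `date_trunc('minute', ts_end) + '1 min'::interval` instead of `date_trunc('minute', ts_end)`. I have not timed that variant on Redshift, and it still counts sessions rather than distinct users.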

For comparison, here are timing results on an hour's worth of data:

  1. Original query: 81301.345 ms
  2. Running-sum query: 36242.342 ms

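You can do this faster by counting the number of starts and stops in each minute and then taking the cumulative sum. The result looks like this: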

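select minute,
       -- running total of the net change per minute (starts minus ends);
       -- Redshift needs an explicit frame when an aggregate window has ORDER BY
       sum(sum(cnt)) over (order by minute rows unbounded preceding) as users
from ((select date_trunc('minute', ts_start) as minute, count(*) as cnt  -- +1 per session start
       from sessions
       group by 1
      ) union all
      (select date_trunc('minute', ts_end), - count(*)  -- -1 per session end
       from sessions
       group by 1
      )
     ) s
group by minute
order by minute;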
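One thing to keep in mind: this query returns a row only for minutes in which at least one session started or ended, so quiet minutes are missing from the output. If you need one row per minute, the gaps can be filled by carrying the previous total forward. A minimal Python sketch of that step follows; `fill_minutes` and the sample rows are hypothetical, not part of the query above.

```python
from datetime import datetime, timedelta

def fill_minutes(rows, first, last):
    """Expand sparse (minute, running_total) rows into one row per minute.

    Minutes without a row inherit the previous total (0 before the first row).
    """
    totals = dict(rows)
    filled, current, minute = [], 0, first
    while minute <= last:
        current = totals.get(minute, current)  # carry forward when no event row exists
        filled.append((minute, current))
        minute += timedelta(minutes=1)
    return filled

# Hypothetical sparse output of the running-sum query.
base = datetime(2024, 1, 1, 10, 0)
sparse = [(base, 3), (base + timedelta(minutes=3), 1)]
dense = fill_minutes(sparse, base, base + timedelta(minutes=4))
# totals per minute from 10:00 to 10:04: 3, 3, 3, 1, 1
```

In the database itself, the same effect could presumably be had by left joining the per-minute deltas onto the `minutes` CTE with an equality join before taking the running sum, which avoids the range join from the original query.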