How to rank event times by a given time window using SQL?
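I am struggling to find a way to rank events using SQL.
The goal is to increase the rank whenever an event occurs more than delta seconds (for example, 1 second) after the previous observation. My attempt so far looks like this: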
select a.event_time, a.user_name, a.object_name, a.rnk, case when a.ddif <= 1000 then 0 else 1 end as new_query,
case when a.ddif <= 1000 then 0 else rnk end as new_rnk
from (
select *, rank() OVER (PARTITION BY user_name ORDER BY event_time) AS rnk,
date_diff('second',lag(event_time) OVER (PARTITION BY user_name ORDER BY event_time),event_time) as ddif
from tmp
) a
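But it only gives me the following result, and I still don't know how to obtain the results highlighted in yellow (either of them would be perfect for me).
Any help would be greatly appreciated.
Please note: I am using Presto DB, so I am limited to this query engine.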
Use lag() and a cumulative sum to define groups. Then assign row numbers:
select t.*,
row_number() over (partition by user_name, grp order by event_time) as seqnum
from (select t.*,
sum(case when prev_et >= event_time - interval '1' second
then 0 else 1
end) over (partition by user_name order by event_time) as grp
from (select t.*,
lag(event_time) over (partition by user_name order by event_time) as prev_et
from tmp t
) t
) t;
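To make the grouping logic easier to follow outside the database, here is a minimal Python sketch of the same idea: a per-user "previous event time" (the lag), a running count of gaps (the cumulative sum), and a counter that restarts in each group (the row number). The function name `rank_events` and the tuple layout are assumptions made for illustration only. A gap strictly greater than delta starts a new group, following the wording of the question; what happens at exactly delta depends on which comparison operator the query uses.

```python
from datetime import datetime, timedelta
from itertools import groupby


def rank_events(events, delta=timedelta(seconds=1)):
    """Hypothetical helper: events is an iterable of (user_name, event_time).

    Returns a list of (user_name, event_time, grp, seqnum), ordered by
    user_name and then event_time.
    """
    out = []
    ordered = sorted(events)  # PARTITION BY user_name ORDER BY event_time
    for user, rows in groupby(ordered, key=lambda r: r[0]):
        prev_et = None  # lag(event_time): nothing precedes the first row
        grp = 0         # cumulative sum of the "new group" flags
        seqnum = 0      # row_number() within (user_name, grp)
        for _, et in rows:
            # No previous row, or a gap longer than delta, opens a new group.
            if prev_et is None or et - prev_et > delta:
                grp += 1
                seqnum = 0
            seqnum += 1
            out.append((user, et, grp, seqnum))
            prev_et = et
    return out


if __name__ == "__main__":
    t0 = datetime(2020, 1, 1, 12, 0, 0)
    sample = [
        ("alice", t0),
        ("alice", t0 + timedelta(seconds=0.5)),
        ("alice", t0 + timedelta(seconds=3)),
        ("bob", t0),
    ]
    for row in rank_events(sample):
        print(row)
```

The group counter restarts for every user, just as the window restarts for every partition in the query.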
You can use lag() and a window sum():
select
    t.*,
    -- running count, per user, of gaps longer than 1 second
    sum(case when event_time <= lag_event_time + interval '1' second then 0 else 1 end)
        over(partition by user_name order by event_time) rnk
from (
    select
        t.*,
        lag(event_time) over(partition by user_name order by event_time) lag_event_time
    from mytable t
) t
Thanks for all the good hints; they pointed me toward the final solution, which is:
select a.*, sum (case when a.ddif <= 1 then 0 else 1 end) over (partition by user_name order by event_time) as acc_rnk
from (
select *, date_diff('second',lag(event_time) OVER (PARTITION BY user_name ORDER BY event_time),event_time) as ddif
from tmp
) a
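For anyone who wants to check the numbers by hand, here is a small Python sketch of what that query computes for a single user. The name `acc_rank` is made up for illustration. It also assumes that date_diff('second', ...) yields a whole number of seconds (truncated), which is how I understand Presto to behave; treat that as an assumption rather than a guarantee.

```python
from datetime import datetime, timedelta


def acc_rank(event_times, delta_seconds=1):
    """Hypothetical helper: event_times are time-ordered datetimes for ONE user.

    Returns a list of (event_time, ddif, acc_rnk).
    """
    rows = []
    acc_rnk = 0
    prev = None
    for et in event_times:
        # ddif stands in for date_diff('second', lag(event_time), event_time).
        # None models the NULL that lag() produces on the first row.
        # Assumption: the difference is truncated to whole seconds.
        ddif = None if prev is None else int((et - prev).total_seconds())
        # In SQL, NULL <= 1 is not true, so the CASE falls through to ELSE 1.
        flag = 0 if (ddif is not None and ddif <= delta_seconds) else 1
        acc_rnk += flag  # sum(...) over (partition by ... order by ...)
        rows.append((et, ddif, acc_rnk))
        prev = et
    return rows


if __name__ == "__main__":
    t0 = datetime(2020, 1, 1, 12, 0, 0)
    times = [t0 + timedelta(seconds=s) for s in (0, 1, 5, 6, 10)]
    for row in acc_rank(times):
        print(row)
```

Note that the first row of each user always gets rank 1, because the lag is NULL there and the CASE takes the ELSE branch.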