Determining how far away the next ID is

I have some data; a subset is shown below:

ID  data start_time 
001    X 2021-12-29 10:54:12.429 +0000
002    Y 2022-01-16 05:07:55.708 +0000 
003    Y 2021-12-31 12:25:12.980 +0000
002    A 2022-01-03 12:49:41.866 +0000
001    A 2021-12-30 16:32:13.736 +0000
001    A 2022-01-17 10:10:10.736 +0000

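I want to determine, in minutes, the time difference between each row and the next occurrence of the same ID in the data frame, ordered by start_time. So if an ID appears at 12:00 and again at 12:01, I want the row to show the time of the next entry and the difference in minutes. This should be done in SQL/Snowflake, preferably with a CTE.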

The following fields should be added: next_timestamp, time_diff, and entry_order.

Expected output:

ID  data start_time                       next_timestamp                 time_diff  entry_order
001    X 2021-12-29 10:54:12.429 +0000    2021-12-30 16:32:13.736 +0000  1778       1
001    A 2021-12-30 16:32:13.736 +0000    2022-01-17 10:10:10.736 +0000  25537      2
003    Y 2021-12-31 12:25:12.980 +0000    NULL                           NULL       1
002    A 2022-01-03 12:49:41.866 +0000    2022-01-16 05:07:55.708 +0000  18258      1
002    Y 2022-01-16 05:07:55.708 +0000    NULL                           NULL       2
001    A 2022-01-17 10:10:10.736 +0000    NULL                           NULL       3

Note that the output is sorted by timestamp in ascending order.

Using LEAD, DATEDIFF, and ROW_NUMBER:

SELECT *,
   LEAD(start_time) OVER(PARTITION BY ID ORDER BY start_time) AS next_timestamp,
   -- Snowflake lets a later expression reuse the next_timestamp alias defined above
   DATEDIFF(minute, start_time, next_timestamp) AS time_diff,
   ROW_NUMBER() OVER(PARTITION BY ID ORDER BY start_time) AS entry_order
FROM tab

The LEAD function can be used to find the next start_time for each ID.

The ROW_NUMBER function returns a sequential number for each row within an ID, ordered by start_time.

SELECT *
, LEAD(start_time) OVER (PARTITION BY ID ORDER BY start_time) AS next_timestamp
, DATEDIFF(minute, start_time, LEAD(start_time) OVER (PARTITION BY ID ORDER BY start_time)) AS time_diff
, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY start_time) AS entry_order
FROM your_table
ORDER BY start_time
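
Since the question prefers a CTE, here is a minimal sketch of the same query in CTE form, wrapped in Python's built-in sqlite3 module so it can be run against the sample rows without a Snowflake account. SQLite is only a stand-in here: it has no DATEDIFF, so the minutes are computed from strftime('%s', ...) epoch seconds, and the helper name next_occurrence is made up for illustration. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Sample rows from the question; the " +0000" suffix is dropped so SQLite can parse them.
ROWS = [
    ("001", "X", "2021-12-29 10:54:12.429"),
    ("002", "Y", "2022-01-16 05:07:55.708"),
    ("003", "Y", "2021-12-31 12:25:12.980"),
    ("002", "A", "2022-01-03 12:49:41.866"),
    ("001", "A", "2021-12-30 16:32:13.736"),
    ("001", "A", "2022-01-17 10:10:10.736"),
]

QUERY = """
WITH ordered AS (
    SELECT ID, data, start_time,
           LEAD(start_time) OVER (PARTITION BY ID ORDER BY start_time) AS next_timestamp,
           ROW_NUMBER()     OVER (PARTITION BY ID ORDER BY start_time) AS entry_order
    FROM your_table
)
SELECT ID, data, start_time, next_timestamp,
       -- whole elapsed minutes (integer division); Snowflake would use DATEDIFF here
       (strftime('%s', next_timestamp) - strftime('%s', start_time)) / 60 AS time_diff,
       entry_order
FROM ordered
ORDER BY start_time
"""


def next_occurrence(rows):
    """Load rows into an in-memory table and run the CTE query (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (ID TEXT, data TEXT, start_time TEXT)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in next_occurrence(ROWS):
        print(row)
```

On the sample rows this gives time_diff values of 1778, 25537, and 18258, with NULL (None in Python) for the last entry of each ID, which matches the expected output. One caveat, as far as I know: Snowflake's DATEDIFF(minute, ...) ignores the seconds and counts minute boundaries crossed, so the second row would come out as 25538 there; FLOOR(DATEDIFF(second, start_time, next_timestamp) / 60) should give whole elapsed minutes instead.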