How to use BigQuery Analytic Functions to calculate time between timestamped rows?

I have a dataset representing analytics events, for example:

Row  timestamp                account_id  type
1    2018-11-14 21:05:40 UTC  abc         start
2    2018-11-14 21:05:40 UTC  xyz         another_type
3    2018-11-26 22:01:19 UTC  xyz         start
4    2018-11-26 22:01:23 UTC  abc         start
5    2018-11-26 22:01:29 UTC  xyz         some_other_type
11   2018-11-26 22:13:58 UTC  xyz         start
...

There are a number of account_ids. I need to find the average time between start records for each account_id.

I am trying to use analytic functions as described here. My end goal is a table like:

Row  account_id  avg_time_between_events_mins
1    xyz         53
2    abc         47
3    pqr         65
...

My best attempt, based on this post, looks like this:

WITH
  events AS (
  SELECT
    COUNTIF(type = 'start' AND account_id='abc') OVER (ORDER BY timestamp) as diff,
    timestamp
  FROM
    `myproject.dataset.events`
  WHERE
    account_id='abc')
SELECT
  min(timestamp) AS start_time,
  max(timestamp) AS next_start_time,
  ABS(timestamp_diff(min(timestamp), max(timestamp), MINUTE)) AS minutes_between
FROM
  events
GROUP BY
  diff

This calculates the time between each start event and the last non-start event before the next start event, for a specific account_id.

I also tried using PARTITION BY with a window frame clause, like this:

WITH
  events AS (
  SELECT
    COUNT(*) OVER (PARTITION BY account_id ORDER BY timestamp ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING) as diff,
    timestamp
  FROM
    `myproject.dataset.events`
  WHERE
    type = 'start')
SELECT
  min(timestamp) AS start_time,
  max(timestamp) AS next_start_time,
  ABS(timestamp_diff(min(timestamp), max(timestamp), MINUTE)) AS minutes_between
FROM
  events
GROUP BY
  diff

But I get a nonsense result table. Can anyone show me how to write and reason about a query like this?

You don't really need analytic functions:

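select account_id,
       -- newest start minus oldest start, divided by the number of gaps;
       -- nullif avoids division by zero for accounts with a single start
       timestamp_diff(max(timestamp), min(timestamp), MINUTE) / nullif(count(*) - 1, 0) as avg_time_between_events_mins
from `myproject.dataset.events`
where type = 'start'
group by account_id;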

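This is the most recent timestamp minus the oldest, divided by the number of starts minus one. The gaps between consecutive starts add up to exactly that overall span, so dividing it by the number of gaps gives the average time between starts.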
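If you do want to see how this looks with an analytic function, here is a minimal sketch using LAG. It assumes the same `myproject.dataset.events` table and column names as in the question; the `starts` CTE and the `prev_start` alias are names I made up for illustration. It computes each gap explicitly and then averages them, which should agree with the min/max version above apart from rounding.

```sql
WITH starts AS (
  SELECT
    account_id,
    timestamp,
    -- Previous 'start' timestamp for the same account;
    -- NULL for the first start of each account.
    LAG(timestamp) OVER (PARTITION BY account_id ORDER BY timestamp) AS prev_start
  FROM
    `myproject.dataset.events`
  WHERE
    type = 'start'
)
SELECT
  account_id,
  -- TIMESTAMP_DIFF returns NULL when prev_start is NULL, and AVG skips NULLs,
  -- so only real gaps are averaged. Accounts with a single start get NULL.
  -- Measuring in seconds and dividing by 60 avoids truncating each gap
  -- to whole minutes.
  AVG(TIMESTAMP_DIFF(timestamp, prev_start, SECOND)) / 60 AS avg_time_between_events_mins
FROM
  starts
GROUP BY
  account_id;
```

The key points are filtering to type = 'start' before the window is applied, and partitioning by account_id so that LAG never reaches across accounts. The per-gap form is also the one to start from if you later need something other than the mean, such as the longest or median gap.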