SQL to get 2 adjacent actions from the flag

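Hope all is well with you!

I have some dummy data, shown below.

For each user, I want to get the 2 adjacent actions leading up to the flag.

Here is a diagram describing my idea.

This is what I want:

How can I implement this in SQL (I am using Google BigQuery)? I hope someone can enlighten me. Thanks a million!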

Try the navigation function LAG:

WITH finishers AS
 (SELECT 'Sophia Liu' as name,
  TIMESTAMP '2016-10-18 2:51:45' as finish_time,
  'F30-34' as division
  UNION ALL SELECT 'Lisa Stelzner', TIMESTAMP '2016-10-18 2:54:11', 'F35-39'
  UNION ALL SELECT 'Nikki Leith', TIMESTAMP '2016-10-18 2:59:01', 'F30-34'
  UNION ALL SELECT 'Lauren Matthews', TIMESTAMP '2016-10-18 3:01:17', 'F35-39'
  UNION ALL SELECT 'Desiree Berry', TIMESTAMP '2016-10-18 3:05:42', 'F35-39'
  UNION ALL SELECT 'Suzy Slane', TIMESTAMP '2016-10-18 3:06:24', 'F35-39'
  UNION ALL SELECT 'Jen Edwards', TIMESTAMP '2016-10-18 3:06:36', 'F30-34'
  UNION ALL SELECT 'Meghan Lederer', TIMESTAMP '2016-10-18 3:07:41', 'F30-34'
  UNION ALL SELECT 'Carly Forte', TIMESTAMP '2016-10-18 3:08:58', 'F25-29'
  UNION ALL SELECT 'Lauren Reasoner', TIMESTAMP '2016-10-18 3:10:14', 'F30-34')
SELECT name,
  finish_time,
  division,
  LAG(name)
    OVER (PARTITION BY division ORDER BY finish_time ASC) AS preceding_runner
FROM finishers;

+-----------------+-------------+----------+------------------+
| name            | finish_time | division | preceding_runner |
+-----------------+-------------+----------+------------------+
| Carly Forte     | 03:08:58    | F25-29   | NULL             |
| Sophia Liu      | 02:51:45    | F30-34   | NULL             |
| Nikki Leith     | 02:59:01    | F30-34   | Sophia Liu       |
| Jen Edwards     | 03:06:36    | F30-34   | Nikki Leith      |
| Meghan Lederer  | 03:07:41    | F30-34   | Jen Edwards      |
| Lauren Reasoner | 03:10:14    | F30-34   | Meghan Lederer   |
| Lisa Stelzner   | 02:54:11    | F35-39   | NULL             |
| Lauren Matthews | 03:01:17    | F35-39   | Lisa Stelzner    |
| Desiree Berry   | 03:05:42    | F35-39   | Lauren Matthews  |
| Suzy Slane      | 03:06:24    | F35-39   | Desiree Berry    |
+-----------------+-------------+----------+------------------+

You seem to want lag(). I would keep the "action sequence" as two separate columns:

select user, prev_action, action, flag
from (select t.*,
             lag(action) over (partition by user order by sequence) as prev_action
      from t
     ) t
where prev_action is not null;
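
To try the query above without the original table, here is a minimal sketch that runs it against a small inline table. The rows in `t` are hypothetical sample data (the data in the question is not reproduced here); only the column names `user`, `sequence`, `action`, and `flag` are taken from the query, and the empty-string convention for unflagged rows is an assumption.

```sql
-- Hypothetical sample data; column names follow the query above.
with t as (
  select 'u1' as user, 1 as sequence, 'login' as action, '' as flag
  union all select 'u1', 2, 'search', ''
  union all select 'u1', 3, 'purchase', 'Y'
  union all select 'u2', 1, 'login', ''
  union all select 'u2', 2, 'logout', 'Y'
)
select user, prev_action, action, flag
from (select t.*,
             -- previous action of the same user, ordered by sequence
             lag(action) over (partition by user order by sequence) as prev_action
      from t
     ) t
-- the first action of each user has no predecessor, so it is dropped
where prev_action is not null;
```

As written, this returns every adjacent pair per user. If only the pairs that end on a flagged row are wanted, an extra condition such as `and flag != ''` in the outer `where` should do it, assuming unflagged rows hold an empty string.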

Consider the option below:

select user, actions.action_sequence, flag  from (
  select *, (
    select as struct count(1) actions_count,
      string_agg(action, ' >> ' order by sequence) action_sequence
    from unnest(arr)
    ) actions
  from (
    select *, array_agg(struct(action, sequence)) 
      over(partition by user order by sequence desc range between current row and 1 following) arr
    from src_table
  ) 
)
where flag != '' 
and actions.actions_count = 2
# order by user, sequence      

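If applied to the sample data in your question, the output is

Note: the solution above can be reused for any number of actions in the sequence you want to analyze, unlike the solutions in the other answers, which are locked to exactly two.

In this solution, you only need to change the numbers in the two lines below (1 and 2, respectively) to whatever you need; no other changes are required :o)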

over(partition by user order by sequence desc range between current row and 1 following) arr               

and actions.actions_count = 2

For example, if you change them to 2 and 3, respectively, the output would be
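
With those two changes applied, the full query would look like the sketch below. The inline `src_table` rows are hypothetical sample data, not the data from the question. Note that a `range` frame counts in `sequence` values rather than in rows, so this relies on the assumption that each user's `sequence` values are consecutive integers.

```sql
-- Hypothetical sample data; the real src_table is assumed to have these columns.
with src_table as (
  select 'u1' as user, 1 as sequence, 'login' as action, '' as flag
  union all select 'u1', 2, 'search', ''
  union all select 'u1', 3, 'purchase', 'Y'
  union all select 'u1', 4, 'logout', ''
)
select user, actions.action_sequence, flag from (
  select *, (
    select as struct count(1) actions_count,
      string_agg(action, ' >> ' order by sequence) action_sequence
    from unnest(arr)
    ) actions
  from (
    select *, array_agg(struct(action, sequence))
      -- "2 following" (was 1): with descending order, the frame reaches two sequence values back
      over(partition by user order by sequence desc range between current row and 2 following) arr
    from src_table
  )
)
where flag != ''
and actions.actions_count = 3  -- was 2: keep only complete three-action windows
```

On this sample data the only flagged row is the one with sequence 3, so the query should return a single row whose action_sequence is `login >> search >> purchase`.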