SQL to get 2 adjacent actions from the flag
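Hope all is well with you!
I have some dummy data as below.
I want to get the 2 adjacent actions at the flag for each user.
Here is a diagram describing my idea.
This is what I want:
How can I implement this in SQL (I'm using Google BigQuery)?
Hope someone can enlighten me. Thanks a million!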
Try the navigation function LAG:
WITH finishers AS
 (SELECT 'Sophia Liu' as name,
         TIMESTAMP '2016-10-18 2:51:45' as finish_time,
         'F30-34' as division
  UNION ALL SELECT 'Lisa Stelzner', TIMESTAMP '2016-10-18 2:54:11', 'F35-39'
  UNION ALL SELECT 'Nikki Leith', TIMESTAMP '2016-10-18 2:59:01', 'F30-34'
  UNION ALL SELECT 'Lauren Matthews', TIMESTAMP '2016-10-18 3:01:17', 'F35-39'
  UNION ALL SELECT 'Desiree Berry', TIMESTAMP '2016-10-18 3:05:42', 'F35-39'
  UNION ALL SELECT 'Suzy Slane', TIMESTAMP '2016-10-18 3:06:24', 'F35-39'
  UNION ALL SELECT 'Jen Edwards', TIMESTAMP '2016-10-18 3:06:36', 'F30-34'
  UNION ALL SELECT 'Meghan Lederer', TIMESTAMP '2016-10-18 3:07:41', 'F30-34'
  UNION ALL SELECT 'Carly Forte', TIMESTAMP '2016-10-18 3:08:58', 'F25-29'
  UNION ALL SELECT 'Lauren Reasoner', TIMESTAMP '2016-10-18 3:10:14', 'F30-34')
SELECT name,
       finish_time,
       division,
       LAG(name)
         OVER (PARTITION BY division ORDER BY finish_time ASC) AS preceding_runner
FROM finishers;
+-----------------+-------------+----------+------------------+
| name            | finish_time | division | preceding_runner |
+-----------------+-------------+----------+------------------+
| Carly Forte     | 03:08:58    | F25-29   | NULL             |
| Sophia Liu      | 02:51:45    | F30-34   | NULL             |
| Nikki Leith     | 02:59:01    | F30-34   | Sophia Liu       |
| Jen Edwards     | 03:06:36    | F30-34   | Nikki Leith      |
| Meghan Lederer  | 03:07:41    | F30-34   | Jen Edwards      |
| Lauren Reasoner | 03:10:14    | F30-34   | Meghan Lederer   |
| Lisa Stelzner   | 02:54:11    | F35-39   | NULL             |
| Lauren Matthews | 03:01:17    | F35-39   | Lisa Stelzner    |
| Desiree Berry   | 03:05:42    | F35-39   | Lauren Matthews  |
| Suzy Slane      | 03:06:24    | F35-39   | Desiree Berry    |
+-----------------+-------------+----------+------------------+
You seem to want lag(). I would keep the "action sequence" as two separate columns:
select user, prev_action, action, flag
from (select t.*,
             lag(action) over (partition by user order by sequence) as prev_action
      from t
     ) t
where prev_action is not null;
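As a minimal, self-contained sketch of the lag() approach, the query below can be pasted into BigQuery as is. The rows in the t CTE are hypothetical, made up only for illustration, with column names taken from the queries in this thread. The extra filter on flag is also an assumption: it treats an empty string as "not flagged", which is what the flag != '' condition in another answer suggests.

```sql
-- Hypothetical sample data (not the data from the question).
with t as (
  select 'A' as user, 1 as sequence, 'login' as action, '' as flag
  union all select 'A', 2, 'search', ''
  union all select 'A', 3, 'purchase', 'x'
  union all select 'B', 1, 'login', ''
  union all select 'B', 2, 'logout', 'x'
)
select user, prev_action, action, flag
from (
  select t.*,
         -- action on the preceding row of the same user, ordered by sequence
         lag(action) over (partition by user order by sequence) as prev_action
  from t
) t
where prev_action is not null
  and flag != ''   -- assumption: unflagged rows hold an empty string
order by user, sequence;
```

With this hypothetical data the query should return one row per user: (A, search, purchase, x) and (B, login, logout, x). Dropping the flag condition gives back every adjacent pair, as in the query above.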
Consider the option below:
select user, actions.action_sequence, flag from (
  select *, (
    select as struct count(1) actions_count,
      string_agg(action, ' >> ' order by sequence) action_sequence
    from unnest(arr)
  ) actions
  from (
    select *, array_agg(struct(action, sequence))
      over(partition by user order by sequence desc range between current row and 1 following) arr
    from src_table
  )
)
where flag != ''
and actions.actions_count = 2
# order by user, sequence
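If applied to the sample data in your question, the output is
Note: the above solution can be reused for any number of actions in the sequence you want to analyze, unlike the solutions in the other answers, which are locked to just two.
In this solution you only need to change the numbers in the lines below (1 and 2 respectively) to whatever you need, with no other changes required :o)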
over(partition by user order by sequence desc range between current row and 1 following) arr
and
and actions.actions_count = 2
For example, if you change them to 2 and 3 respectively, the output will contain sequences of three actions.
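As a rough sketch of that three-action variant, here is the same query with the two numbers changed to 2 and 3, run against a small inline src_table. The rows are hypothetical and exist only to make the query self-contained. They are not the data from the question.

```sql
-- Hypothetical sample data (not the data from the question).
with src_table as (
  select 'A' as user, 1 as sequence, 'login' as action, '' as flag
  union all select 'A', 2, 'search', ''
  union all select 'A', 3, 'purchase', 'x'
  union all select 'B', 1, 'login', ''
  union all select 'B', 2, 'logout', 'x'
)
select user, actions.action_sequence, flag
from (
  select *, (
    -- per row: how many actions fell in the window, and their ordered sequence
    select as struct
      count(1) as actions_count,
      string_agg(action, ' >> ' order by sequence) as action_sequence
    from unnest(arr)
  ) as actions
  from (
    select *,
      -- current row plus rows whose sequence is up to 2 lower (descending order)
      array_agg(struct(action, sequence))
        over (partition by user order by sequence desc
              range between current row and 2 following) as arr
    from src_table
  )
)
where flag != ''
  and actions.actions_count = 3;
```

With this hypothetical data only user A should come back, with the sequence login >> search >> purchase. User B has just two actions up to the flagged row, so the actions_count = 3 filter drops it. Note that range is value-based, so this relies on sequence being consecutive integers per user. If there can be gaps, rows between current row and 2 following counts rows by position instead.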