Presto SQL to find ids having specific column values inserted in sequence
I have a table:

    user_id  user_type  date_updated
    1        Beginner   10/10/2020
    1        Moderate   10/11/2020
    1        Advanced   10/12/2020
    2        Beginner   10/10/2020
    2        Moderate   10/11/2020
    2        Expert     10/12/2020
    2        Advanced   10/13/2020

I'm looking for SQL to find the user_ids whose user_type values follow the sequence Beginner -> Moderate -> Advanced in increasing order of date_updated.
For the table above, the result should be user_id 1, because it has Beginner (10/10/2020) -> Moderate (10/11/2020) -> Advanced (10/12/2020).
user_id 2 does not qualify, because the required types do not directly follow one another: Beginner -> Moderate -> Expert -> Advanced.
One method is aggregation with conditional logic in the having clause:
    select user_id
    from t
    group by user_id
    -- latest Beginner date < latest Moderate date < latest Advanced date
    having max(case when user_type = 'Beginner' then date_updated end) < max(case when user_type = 'Moderate' then date_updated end) and
           max(case when user_type = 'Moderate' then date_updated end) < max(case when user_type = 'Advanced' then date_updated end);
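Edit:
Based on the revised question, use lag(). The having query only compares dates, so it would also return user_id 2, where Expert falls between Moderate and Advanced. This assumes a given user has no duplicate user types: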
    select user_id
    from (select t.*,
                 -- user_type of the previous row and of the row before that, per user
                 lag(user_type) over (partition by user_id order by date_updated) as prev_user_type,
                 lag(user_type, 2) over (partition by user_id order by date_updated) as prev2_user_type
          from t
         ) t
    where prev2_user_type = 'Beginner' and
          prev_user_type = 'Moderate' and
          user_type = 'Advanced'
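To try the lag() approach without creating a table, the sample rows from the question can be supplied inline. This is only a minimal sketch: the CTE name t mirrors the table name used in the answer, the derived-table aliases v and x are arbitrary, and storing date_updated as a DATE (rather than the MM/DD/YYYY text shown in the question) is an assumption made so that the ordering is chronological.

```sql
-- Sample data from the question, inlined as a CTE.
-- Assumption: date_updated is a DATE; the question does not state the column type.
with t as (
    select *
    from (values
        (1, 'Beginner', date '2020-10-10'),
        (1, 'Moderate', date '2020-10-11'),
        (1, 'Advanced', date '2020-10-12'),
        (2, 'Beginner', date '2020-10-10'),
        (2, 'Moderate', date '2020-10-11'),
        (2, 'Expert',   date '2020-10-12'),
        (2, 'Advanced', date '2020-10-13')
    ) as v (user_id, user_type, date_updated)
)
select user_id
from (select t.*,
             lag(user_type) over (partition by user_id order by date_updated) as prev_user_type,
             lag(user_type, 2) over (partition by user_id order by date_updated) as prev2_user_type
      from t
     ) x
-- keep only the Advanced rows directly preceded by Moderate, then Beginner
where prev2_user_type = 'Beginner' and
      prev_user_type = 'Moderate' and
      user_type = 'Advanced'
```

With these seven rows, only user_id 1 has an Advanced row whose two immediate predecessors are Moderate and Beginner. For user_id 2, the row before Advanced is Expert, so it is filtered out.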
Another approach uses aggregation, comparing each user's ordered list of types with the target array:
    select user_id
    from mytable
    group by user_id
    -- collect each user's types in date order and compare with the target sequence
    having array_agg(user_type order by date_updated) = array['Beginner', 'Moderate', 'Advanced']
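Note that the array equality matches only users whose complete history is exactly these three rows. If a user may have other rows before or after the sequence, one possible variant (a sketch, not part of the original answer) joins the ordered types into a delimited string and searches for the target run. It assumes that no user_type value contains the '>' delimiter and that user_type is never NULL, since array_join skips NULL elements and could then report a false adjacency.

```sql
select user_id
from mytable
group by user_id
-- Wrap the joined list in delimiters so that only whole values can match,
-- then look for the three types as one consecutive run.
-- Assumption: '>' never occurs inside a user_type value.
having strpos(
           '>' || array_join(array_agg(user_type order by date_updated), '>') || '>',
           '>Beginner>Moderate>Advanced>'
       ) > 0
```

On the sample data, user_id 1 yields '>Beginner>Moderate>Advanced>', which contains the target run, while user_id 2 yields '>Beginner>Moderate>Expert>Advanced>', which does not.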