Snowflake window function last_value and max
I have a table like this:
For each user_key, I want to get the top total_listened per date/month/weekday/week.
I think I need to use a window function.
I can get the different date parts with:
MONTH(stream_date) for months
WEEKDAY(stream_date) for weekday
WEEK(stream_date) for week
I tried this:
select
    MAX(vals.total_listened_per_day) as RECORD_STREAM_DAY_TIME,
    MAX(vals.total_listened_per_month) as RECORD_STREAM_MONTH_TIME,
    MAX(vals.total_listened_per_week) as RECORD_STREAM_WEEK_TIME,
    MAX(vals.most_active_weekday) as MOST_ACTIVE_WEEKDAY_TIME
    last_value(days.date) over (partition by user_key order by days.total_listened) as RECORD_STREAMDAY,
from
(
    select user_key, stream_date as date,
        sum(st.length_listened) over (partition by user_key, stream_date) as total_listened_per_day,
        sum(st.length_listened) over (partition by user_key, MONTH(stream_date)) as total_listened_per_month,
        sum(st.length_listened) over (partition by user_key, WEEK(stream_date)) as total_listened_per_week,
        sum(st.length_listened) over (partition by user_key, DAYNAME(stream_date)) as most_active_weekday
    group by 1,2
    .....
)
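This works for getting the amounts (the columns ending in _TIME), but not for getting the specific day/month... (the columns without _TIME at the end, such as RECORD_STREAMDAY). That is because of the group by: it groups by stream_date rather than by month(stream_date), for example. I don't know how to do this without writing a subquery for each one.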
I think the logic you want is:
select user_key,
    max(total_listened_per_day)   as max_total_listened_per_day,
    max(total_listened_per_week)  as max_total_listened_per_week,
    max(total_listened_per_month) as max_total_listened_per_month,
    max(case when rn_day   = 1 then date_trunc('day',   stream_date) end) as most_active_day,
    max(case when rn_week  = 1 then date_trunc('week',  stream_date) end) as most_active_week,
    max(case when rn_month = 1 then date_trunc('month', stream_date) end) as most_active_month
from (
    select t.*,
        rank() over(partition by user_key order by total_listened_per_day   desc) as rn_day,
        rank() over(partition by user_key order by total_listened_per_week  desc) as rn_week,
        rank() over(partition by user_key order by total_listened_per_month desc) as rn_month
    from (
        select t.*,
            sum(t.length_listened) over (partition by user_key, date_trunc('day',   stream_date)) as total_listened_per_day,
            sum(t.length_listened) over (partition by user_key, date_trunc('week',  stream_date)) as total_listened_per_week,
            sum(t.length_listened) over (partition by user_key, date_trunc('month', stream_date)) as total_listened_per_month
        from mytable t
    ) t
) t
group by user_key
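The innermost subquery computes window sums of the listening time per day, week, and month. The next subquery uses those totals to rank the rows. Finally, the outer query uses conditional aggregation to bring back the corresponding durations and periods. If there are ties, the most recent period is chosen, because max() keeps the latest of the rank-1 periods.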
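To see the three layers at work on concrete rows, here is a minimal sketch that runs the same pattern locally. It is not Snowflake: it uses Python's built-in sqlite3 module, and since SQLite has no date_trunc, strftime('%Y-%m', ...) stands in for the month truncation. The sample rows and the helper name top_periods are made up for illustration, and only the day and month levels are shown to keep it short.

```python
import sqlite3

# Hypothetical sample rows: (user_key, stream_date, length_listened).
ROWS = [
    (1, "2021-01-04", 30),
    (1, "2021-01-04", 20),
    (1, "2021-01-05", 10),
    (1, "2021-02-01", 45),
    (2, "2021-01-10", 5),
    (2, "2021-02-11", 7),
    (2, "2021-02-12", 6),
]

# Same shape as the query above: window sums, then ranks, then
# conditional aggregation. strftime replaces date_trunc (SQLite only).
QUERY = """
select user_key,
    max(total_listened_per_day)   as max_total_listened_per_day,
    max(total_listened_per_month) as max_total_listened_per_month,
    max(case when rn_day = 1 then stream_date end) as most_active_day,
    max(case when rn_month = 1 then strftime('%Y-%m', stream_date) end) as most_active_month
from (
    select t.*,
        rank() over (partition by user_key order by total_listened_per_day desc) as rn_day,
        rank() over (partition by user_key order by total_listened_per_month desc) as rn_month
    from (
        select t.*,
            sum(length_listened) over (partition by user_key, stream_date) as total_listened_per_day,
            sum(length_listened) over (partition by user_key, strftime('%Y-%m', stream_date)) as total_listened_per_month
        from mytable t
    ) t
) t
group by user_key
order by user_key
"""


def top_periods(rows):
    """Load rows into an in-memory table and return one result row per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table mytable (user_key integer, stream_date text, length_listened integer)"
    )
    conn.executemany("insert into mytable values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_periods(ROWS):
        print(row)
```

For user 1 the two streams on 2021-01-04 add up to 50, which beats the single 45 on 2021-02-01, while January as a whole (60) beats February (45). The day and month records are therefore found independently of each other, with no separate subquery per period.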