Drop Off Funnel in SQL
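I have a table with user_seq_id and the number of days each user was active in the program. I want to understand the drop-off funnel: how many users were active on day 0 (100%), on day 1, on day 2, and so on.
Input table: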
create table test (
user_seq_id int ,
NoOfDaysUserWasActive int
);
insert into test (user_seq_id , NoOfDaysUserWasActive)
values (13451, 2), (76453, 1), (22342, 3), (11654, 0),
(54659, 2), (64420, 1), (48906, 5);
I want Day, ActiveUsers, and the % Distribution of those users.
One approach does not use window functions at all. It only needs a list of days and an aggregation:
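-- A user counts as active on a day if their active-day total is at least that day.
select v.day, count(t.user_seq_id) as ActiveUsers,
       -- multiply by 1.0 so the ratio is not truncated by integer division
       count(t.user_seq_id) * 1.0 / c.cnt as ratio
from (select 0 as day union all select 1 union all select 2 union all
      select 3 union all select 4 union all select 5
     ) v(day) left join
     test t
     on v.day <= t.NoOfDaysUserWasActive cross join
     (select count(*) as cnt from test) c
group by v.day, c.cnt
order by v.day asc;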
Here is a db<>fiddle.
The mention of window functions suggests that you are thinking of something like this:
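The hard-coded list of days 0 through 5 only fits this sample. As a minimal sketch, the list could instead be generated from the data with a recursive CTE. This assumes a dialect that supports `with recursive` (for example PostgreSQL, MySQL 8+, or SQLite); other dialects may need a different syntax. The CTE name `days` and the alias `active_users` are illustrative, not part of the original.

```sql
-- Assumption: the dialect supports WITH RECURSIVE.
-- Generate one row per day from 0 up to the largest value in the table.
with recursive days(day) as (
    select 0
    union all
    select day + 1
    from days
    where day < (select max(NoOfDaysUserWasActive) from test)
)
select d.day,
       count(t.user_seq_id) as active_users,
       -- 1.0 forces a non-integer division
       count(t.user_seq_id) * 1.0 / c.cnt as ratio
from days d
left join test t
       on d.day <= t.NoOfDaysUserWasActive
cross join (select count(*) as cnt from test) c
group by d.day, c.cnt
order by d.day;
```

With the sample rows above, the active-user counts for days 0 through 5 are 7, 6, 4, 2, 1, 1, so day 0 is 7/7 = 100% and day 1 is 6/7.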
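select NoOfDaysUserWasActive,
       -- running total from the highest day down: users active at least this many days
       sum(count(*)) over (order by NoOfDaysUserWasActive desc) as cnt,
       -- multiply by 1.0 so the ratio is not truncated by integer division
       sum(count(*)) over (order by NoOfDaysUserWasActive desc) * 1.0 / sum(count(*)) over () as ratio
from test
group by NoOfDaysUserWasActive
order by NoOfDaysUserWasActive;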
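The problem is that this does not "fill in" days that do not appear explicitly in the original data. With the sample rows, for instance, no user has exactly 4 active days, so there is no row for day 4. If that is not an issue, this version should perform better.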
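If the missing days matter, one possible way to keep the window-function approach is to aggregate once per distinct value, left join that onto a generated list of days, and run the cumulative sum over the joined result. This is only a sketch under the same assumption of `with recursive` support plus window functions; `days`, `per_day`, and `users` are illustrative names.

```sql
-- Assumption: the dialect supports WITH RECURSIVE and window functions.
with recursive days(day) as (
    select 0
    union all
    select day + 1
    from days
    where day < (select max(NoOfDaysUserWasActive) from test)
),
per_day as (
    -- one row per value that actually occurs in the data
    select NoOfDaysUserWasActive as day, count(*) as users
    from test
    group by NoOfDaysUserWasActive
)
select d.day,
       -- days with no matching row contribute 0 to the running total
       sum(coalesce(p.users, 0)) over (order by d.day desc) as active_users,
       sum(coalesce(p.users, 0)) over (order by d.day desc) * 1.0
           / sum(coalesce(p.users, 0)) over () as ratio
from days d
left join per_day p
       on p.day = d.day
order by d.day;
```

For the sample data this adds the row for day 4, which carries the same count as day 5 (1 user), because nobody dropped off between those two days.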