填补空白,基于事件

Fill the gaps, event based

我正在尝试根据 activity 他们本可以完成的客户流失来计算客户流失,而不是按日期流失,这是正常的。我们有连接到特定主机的事件,在我的示例中,所有事件都由 Alice 主持,但它可以是不同的主机。

关注特定事件的所有人员都应归入一个类别(新的、活跃的、流失的和复活的)。

: 第一次有人关注特定主持人的活动。
活动:再次关注(并且还关注了来自特定主机的最后一个事件)。
Churned:关注者有机会关注但没有。
已复活:已流失的关注者已开始关注之前关注的主机。

declare @events table (event varchar(50), host varchar(50), date date)
declare @eventFollows table (event varchar(50), follower varchar(50))

insert into @events values ('e_1', 'Alice', GETDATE())
insert into @events values ('e_2', 'Alice', GETDATE())
insert into @events values ('e_3', 'Alice', GETDATE())
insert into @events values ('e_4', 'Alice', GETDATE())
insert into @events values ('e_5', 'Alice', GETDATE())

insert into @eventFollows values ('e_1', 'Bob') --new
insert into @eventFollows values ('e_2', 'Bob') --active
--Bob churned 
insert into @eventFollows values ('e_4', 'Megan') --new 
insert into @eventFollows values ('e_5', 'Bob') --resurrected
insert into @eventFollows values ('e_5', 'Megan') --active 

select * from @events
select * from @eventFollows

预期的结果应该是这样的

select 'e_1', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Bob follows Alice event
union all
select 'e_2', 0 as New, 0 as resurrected, 1 as active, 0 as churned --Bob follows the next event that Alice host (considered as Active)
union all
select 'e_3', 0 as New, 0 as resurrected, 0 as active, 1 as churned --Bob churns since he does not follow the next event 
union all
select 'e_4', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Megan follows Alice event
union all
select 'e_5', 0 as New, 1 as resurrected, 1 as active, 0 as churned --Second time (active) for Megan and Bob is resurrected

我开始查询类似下面的内容,但问题是我没有得到关注者未关注(但本可以关注)的所有事件。

select a.event, follower, date, 
    LAG (a.event,1) over (partition by a.host, ma.follower order by date) as lag, 
    LEAD (a.event,1) over (partition by a.host, ma.follower order by date) as lead,
    LAG (a.event,1) over (partition by a.host order by date) as lagP, 
    LEAD (a.event,1) over (partition by a.host order by date) as leadP
from @events a left join @eventFollows ma on ma.event = a.event order by host, follower, date

有什么想法吗?

这符合您想要的结果

SELECT 
    X.event, X.host, X.date,
    IsNew =  SUM(CASE WHEN X.FirstFollowerEvent = X.event THEN 1 ELSE 0 END),
    IsActive = SUM(CASE WHEN X.lagFollowerEvent = X.lagEvent THEN 1 ELSE 0 END),
    IsChurned = SUM(CASE WHEN X.follower IS NULL THEN 1 ELSE 0 END),
    IsResurrected = SUM(CASE WHEN X.lagFollowerEvent <> X.lagEvent AND X.FirstFollowerEvent IS NOT NULL THEN 1 ELSE 0 END)
FROM 
    (
    select 
        a.event, a.host, ma.follower, a.date, 
        FIRST_VALUE(a.event) over (partition by a.host, ma.follower order by a.date, a.event) as FirstFollowerEvent, 
        LAG (a.event,1) over (partition by a.host, ma.follower order by a.date, a.event) as lagFollowerEvent, 
        LAG (a.event,1) over (partition by a.host order by a.date, a.event) as lagEvent
    FROM
         @events a
         LEFT join 
         @eventFollows ma on a.event = ma.event
    ) X
GROUP BY
    X.event, X.host, X.date
ORDER by 
    X.event, X.host, X.date

这似乎是一种间接的方法,但可以通过检查数字中的间隙来检测孤岛:

;with nrsE as
(
    select *, ROW_NUMBER() over (order by event) rnrE from @events
), nrs as
(
    select  f.*,host, rnrE, ROW_NUMBER() over (partition by f.follower, e.host order by f.event ) rnrF
    from nrsE e
    join @eventFollows f on f.event = e.event
), f as
(
    select host, follower, min(rnrE) FirstE, max(rnrE) LastE, ROW_NUMBER() over (partition by follower, host order by rnrE - rnrF) SeqNr 
    from nrs
    group by host, follower, rnrE - rnrF --difference between rnr-Event and rnr-Follower to detect gaps
), stat as   --from the result above  on there are several options. this example uses getting a 'status' and pivoting on it
(
    select e.event, e.host, case when f.FirstE is null then 'No participants' when f.LastE = e.rnrE - 1 then 'Churned' when rnrE = f.FirstE then case when SeqNr = 1 then 'New' else 'Resurrected' end else 'Active' end Status
    from nrsE e
    left join f on e.rnrE between f.FirstE and f.LastE + 1 and e.host = f.host
)
select p.* from stat pivot(count(Status) for Status in ([New], [Resurrected], [Active],  [Churned])) p

最后 2 个步骤可以简化,但以这种方式获得 'Status' 可能可重复用于其他场景