Query to find the number of paying customers and churned customers?

I have a table, paid_users, that looks like this:

http://sqlfiddle.com/#!15/d25ba

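I'm trying to determine the number of paying customers and the number of churned customers, each grouped by month-year. Essentially, there are payors and users. A payor is the person who pays for a particular user. If there is no payment_stop_date, the payor is still paying for that user. A payment_stop_date records if and when the payor stopped paying for the user.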
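In case the fiddle link is unavailable, here is a minimal sketch of what the table might look like. The column names come from this post; the column types, the exact days, and user_id 3183 are assumptions made up for illustration, not the real fiddle data.

```sql
-- Hypothetical schema: column names are from the post, types are assumed.
create table paid_users (
    payor_id           integer not null,  -- the person who pays
    user_id            integer not null,  -- the person being paid for
    payment_start_date date    not null,
    payment_stop_date  date               -- null means the payor is still paying
);

-- Illustrative rows for payor 3453; the days and user 3183 are made up.
insert into paid_users values
    (3453, 3182, '2014-11-05', '2014-12-10'),
    (3453, 3183, '2014-11-20', '2014-12-10'),
    (3453, 4716, '2015-01-15', null);

-- Payors who are paying for at least one user right now.
select distinct payor_id
from paid_users
where payment_stop_date is null;
```

With these rows, payor 3453 still shows up as a current payor because of user 4716, even though she cancelled both earlier users.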

I want to find the number of paying customers, where the query result should look like this:

Month-Year | New Paying Customers | Churned Paying Customers
------------------------------------------------------------
11-2014    | 1                    |
12-2014    |                      | 1
01-2015    | 1                    |
04-2015    |                      |
06-2015    | 2                    |
07-2015    | 1                    |
10-2015    |                      | 1

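Take payor_id 3453: she started paying for user_id 3182 in 11-2014, so she is counted in the 11-2014 group. However, she stopped paying for both of her users in 12-2014, so she is counted as churned in the 12-2014 group. A payor is considered a churned paying customer once they have stopped paying us entirely (for example, they could have been paying for one person and then cancelled, or, as in this case, payor_id 3453 paid for 2 users and then cancelled both). payor_id 3453 then started paying for user_id 4716 in 01-2015, so she is counted again in the 01-2015 group.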

I'm having trouble writing a query for this because I can't simply count distinct payor_id values: payor_id 3453 is counted as a new paying customer twice.

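Not sure I understand correctly: for each month, you want to know how many customers started paying for their first user and how many customers stopped paying for their last user?

The solution looks rather complicated, but maybe the problem just isn't that easy after all.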

with months as
(
    -- pair every month since 2014-06 with every row of paid_users
    select * from
    generate_series('2014-06-01', now() at time zone 'utc', interval '1 month') as month
    cross join paid_users
)
, sums as
(
    -- running total of users each payor is paying for, month by month
    select month, payor_id, joiners, leavers,
           sum(net) over (partition by payor_id order by month) as sum
    from
    (
        select month, payor_id, joiners, leavers, coalesce(joiners, 0) - coalesce(leavers, 0) as net
        from
        (
            -- users each payor started paying for in this month
            select payor_id, month, count(*) as joiners
            from months
            where payment_start_date >= month
            and payment_start_date < month + interval '1 month'
            group by month, payor_id
        ) as t
        full join
        (
            -- users each payor stopped paying for in this month
            select payor_id, month, count(*) as leavers
            from months
            where payment_stop_date >= month
            and payment_stop_date < month + interval '1 month'
            group by month, payor_id
        ) as u
        using (month, payor_id)
    ) as v
)

select * from sums
order by payor_id, month

The above should give the running total of paid users for each customer:

        month        | payor_id | joiners | leavers | sum 
---------------------+----------+---------+---------+-----
 2014-06-01 00:00:00 |     1725 |       1 |         |   1
 2014-06-01 00:00:00 |     1929 |       1 |         |   1
 2015-10-01 00:00:00 |     1929 |         |       1 |   0
 2014-06-01 00:00:00 |     1986 |       1 |         |   1
 2014-11-01 00:00:00 |     3453 |       2 |         |   2
 2014-12-01 00:00:00 |     3453 |         |       2 |   0
 2015-01-01 00:00:00 |     3453 |       1 |         |   1
 2015-03-01 00:00:00 |     3453 |       1 |         |   2
 2015-04-01 00:00:00 |     3453 |       2 |       1 |   3
 2015-05-01 00:00:00 |     3453 |         |       1 |   2
 2015-06-01 00:00:00 |     3453 |         |       1 |   1
 2015-10-01 00:00:00 |     3453 |       1 |         |   2
 2015-07-01 00:00:00 |     6499 |       1 |         |   1
 2015-08-01 00:00:00 |     6499 |       3 |         |   4
 2015-10-01 00:00:00 |     6499 |         |       1 |   3
 2015-11-01 00:00:00 |     6499 |         |       1 |   2
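
To see the running total in isolation, here is a small self-contained sketch. The rows in changes are hypothetical net changes for a single payor (modelled on the 3453 rows above, not read from the fiddle), and the column name paid_users_total is just an illustrative label.

```sql
-- Hypothetical per-month net changes for one payor (not the fiddle data).
with changes (month, payor_id, net) as (
    values
        (date '2014-11-01', 3453,  2),  -- started paying for two users
        (date '2014-12-01', 3453, -2),  -- stopped paying for both
        (date '2015-01-01', 3453,  1)   -- started paying for one user again
)
select month,
       payor_id,
       net,
       -- cumulative sum of net changes, restarted for every payor
       sum(net) over (partition by payor_id order by month) as paid_users_total
from changes
order by payor_id, month;
```

Because the window has an order by, the default frame runs from the first row of the partition up to the current row, so paid_users_total comes out as 2, 0, 1 for the three months.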

So new customers are those whose sum goes from 0 to a non-zero value, and churned customers are those whose sum is 0?

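-- Uses the months and sums CTEs from the query above: put this select
-- after that with clause, in place of "select * from sums".
select month, new, churned from
(
    (
        -- payors whose running total is 0 in this month
        select month, count(*) as churned
        from sums
        where sum = 0
        group by month
    ) as l
    full join
    (
        -- payors whose running total went from 0 to a positive value
        select month, count(*) as new
        from (
            select month, payor_id, sum, coalesce(lag(sum) over (partition by payor_id order by month), 0) as prev_sum
            from sums
        ) as t
        where prev_sum = 0 and sum > 0
        group by month
    ) as r
    using (month)
)
order by month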

Output:

        month        | new | churned 
---------------------+-----+---------
 2014-06-01 00:00:00 |   3 |        
 2014-11-01 00:00:00 |   1 |        
 2014-12-01 00:00:00 |     |       1
 2015-01-01 00:00:00 |   1 |        
 2015-07-01 00:00:00 |   1 |        
 2015-10-01 00:00:00 |     |       1
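
The transition test can also be tried on its own. The sketch below uses a hypothetical, hard-coded stand-in for sums (with the running total renamed to total) and flags each row instead of counting. Note one deliberate difference: is_churned here also requires the previous total to be positive, whereas the query above only checks for a total of 0, so the two disagree for a payor who starts and stops within the same month.

```sql
-- Hypothetical running totals, shaped like the output of the sums CTE.
with sums (month, payor_id, total) as (
    values
        (date '2014-11-01', 3453, 2),
        (date '2014-12-01', 3453, 0),
        (date '2015-01-01', 3453, 1),
        (date '2015-03-01', 3453, 2)
)
select month,
       payor_id,
       total,
       prev_total,
       prev_total = 0 and total > 0 as is_new,      -- 0 -> positive
       prev_total > 0 and total = 0 as is_churned   -- positive -> 0
from (
    select month, payor_id, total,
           -- a payor's first row has no predecessor, so treat it as 0
           coalesce(lag(total) over (partition by payor_id order by month), 0) as prev_total
    from sums
) as t
order by payor_id, month;
```

On these rows, is_new is true for 2014-11 and 2015-01, and is_churned is true only for 2014-12, which is how the same payor gets counted as new twice.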

Hope this helps. If anyone knows a simpler way, I'd be glad to hear it.