Don't want to double count in a filtered aggregation

Sample data:

shopper_id  last_purchase_timestamp  active_p30  active_p60  active_over_p90
1           2022-03-02 1:20:00       TRUE        TRUE        TRUE
2           2022-03-01 1:30:00       TRUE        TRUE        TRUE
3           2022-02-28 1:24:03       TRUE        TRUE        TRUE
4           2022-02-02 21:22:26      FALSE       TRUE        TRUE

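I want to count whether shoppers were active (based on their last purchase) in the past 30 days (as of March 5), the past 60 days, and so on.

My goal is to find out how many shoppers made their last purchase within the past 30 days, how many made it within the past 60 days, and so on. However, I don't want to double count.

What I tried: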

count(*) FILTER (where last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '30' day)
AS total_active_p30,

count(*) FILTER (where last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '60' day)
AS total_active_p60,

count(*) FILTER (where last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '90' day)
AS total_active_p90

Result:

total_active_p30 total_active_p60 total_active_p90
3 4 4

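However, this double counts: a shopper who is active within 30 days is also counted in the 60-day and 90-day columns. How can I prevent that? The counts should add up to 4.

My ideal output is: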

total_active_p30 total_active_p60 total_active_p90
3 1 0

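Thanks in advance, everyone! I'm using Trino.

Add both an upper and a lower bound to each filter so that the ranges don't overlap. Something along these lines: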

-- sample data
WITH dataset (last_purchase_timestamp) AS (
    VALUES (timestamp '2022-03-02 1:20:00'),
        (timestamp '2022-03-01 1:30:00'),
        (timestamp '2022-02-28 1:24:03'),
        (timestamp '2022-02-02 21:22:26')
)

-- query
select count_if(last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '30' day) total_active_p30,
    count_if(last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '60' day and last_purchase_timestamp < DATE '2022-03-05' - INTERVAL '30' day) total_active_p60,
    count_if(last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '90' day and last_purchase_timestamp < DATE '2022-03-05' - INTERVAL '60' day) total_active_p90 
from dataset

Output:

total_active_p30 total_active_p60 total_active_p90
3 1 0
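
As an alternative sketch (not part of the answer above), the cumulative filters from the question can be kept as they are, with the disjoint counts derived by subtracting adjacent columns. This works because each wider window contains the narrower one. The CTE name cumulative and the within_30 / within_60 / within_90 column names are made up for illustration, and the dataset CTE simply repeats the sample rows:

```sql
-- sample data (same four rows as in the question)
WITH dataset (last_purchase_timestamp) AS (
    VALUES (timestamp '2022-03-02 1:20:00'),
        (timestamp '2022-03-01 1:30:00'),
        (timestamp '2022-02-28 1:24:03'),
        (timestamp '2022-02-02 21:22:26')
),
-- cumulative counts: each window includes all narrower windows
cumulative AS (
    SELECT count_if(last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '30' DAY) AS within_30,
        count_if(last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '60' DAY) AS within_60,
        count_if(last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '90' DAY) AS within_90
    FROM dataset
)
-- subtract the narrower window to keep only the rows new to each wider window
SELECT within_30 AS total_active_p30,
    within_60 - within_30 AS total_active_p60,
    within_90 - within_60 AS total_active_p90
FROM cumulative
```

On the sample data the cumulative counts are 3, 4 and 4, so the differences come out as 3, 1 and 0, matching the output above.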

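The conditions in your query overlap: every row that satisfies >= DATE '2022-03-05' - INTERVAL '60' day also satisfies >= DATE '2022-03-05' - INTERVAL '90' day, so it is counted in both columns. To avoid that, write the query like this: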

count(*) filter (where last_purchase_timestamp >= (DATE '2022-03-05' - INTERVAL '30' day))
as total_active_p30,

count(*) filter (where last_purchase_timestamp >= (DATE '2022-03-05' - INTERVAL '60' day)
                            and last_purchase_timestamp < (DATE '2022-03-05' - INTERVAL '30' day))
as total_active_p60,

count(*) filter (where last_purchase_timestamp >= (DATE '2022-03-05' - INTERVAL '90' day)
                        and last_purchase_timestamp < (DATE '2022-03-05' - INTERVAL '60' day))
as total_active_p90
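
If one row per bucket is acceptable instead of one column per bucket, a CASE expression is another way to make sure each shopper is counted only once, since CASE returns the first branch that matches. The following is only a sketch under that assumption. The bucket labels ('p30', 'p60', 'p90', 'older'), the shoppers column name, and the inline dataset CTE are illustrative, not taken from the original post:

```sql
-- sample data (same four rows as in the question)
WITH dataset (last_purchase_timestamp) AS (
    VALUES (timestamp '2022-03-02 1:20:00'),
        (timestamp '2022-03-01 1:30:00'),
        (timestamp '2022-02-28 1:24:03'),
        (timestamp '2022-02-02 21:22:26')
)
SELECT bucket, count(*) AS shoppers
FROM (
    -- CASE stops at the first matching branch, so the buckets cannot overlap
    SELECT CASE
            WHEN last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '30' DAY THEN 'p30'
            WHEN last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '60' DAY THEN 'p60'
            WHEN last_purchase_timestamp >= DATE '2022-03-05' - INTERVAL '90' DAY THEN 'p90'
            ELSE 'older'
        END AS bucket
    FROM dataset
) AS bucketed
GROUP BY bucket
ORDER BY bucket
```

Note that a bucket with no rows (p90 in the sample data) does not appear in the result at all, whereas the filtered counts above report it as 0.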