SQL BigQuery - how can I calculate, for each order, the count of that customer's orders in the preceding full 12-month period?

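For each order record, I want to count that customer's earlier orders in the preceding complete 12-month period, excluding the month of the order itself.

I can do the count without any date restriction (code below).

But I can't work out how to restrict the count to a 'rolling' date range.

I'd be grateful for any suggestions on what I'm missing!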

With input_data AS (

SELECT '#1238' as order_id, DATE('2021-12-15') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1201' as order_id, DATE('2021-10-10') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1198' as order_id, DATE('2021-07-05') as order_date, 'c12345' as cust_id, 20 as order_value
UNION ALL SELECT '#1134' as order_id, DATE('2020-10-15') as order_date, 'c12345' as cust_id, 10 as order_value
UNION ALL SELECT '#1112' as order_id, DATE('2019-08-10') as order_date, 'c12345' as cust_id, 5 as order_value
UNION ALL SELECT '#1234' as order_id, DATE('2021-07-05') as order_date, 'c11111' as cust_id, 118 as order_value
UNION ALL SELECT '#1294' as order_id, DATE('2021-01-05') as order_date, 'c11111' as cust_id, 68 as order_value
UNION ALL SELECT '#1290' as order_id, DATE('2021-01-01') as order_date, 'c11111' as cust_id, 82 as order_value
UNION ALL SELECT '#1284' as order_id, DATE('2020-01-15') as order_date, 'c22222' as cust_id, 98 as order_value)

SELECT
  order_id
  , cust_id
  , order_date
  , prev_12m_orders
FROM (
  SELECT order_id, cust_id, order_date,
    COUNT(order_id) OVER(PARTITION BY cust_id ORDER BY order_date DESC ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS prev_12m_orders
  FROM input_data
)

-- Limit prev_12m_orders to the range of the last complete 12 month period, something like 
-- order_date < DATE_SUB(DATE_TRUNC(order_date, MONTH), INTERVAL 1 DAY) AS last_day_prev_mth
-- order_date < DATE_SUB(DATE_TRUNC(order_date, MONTH), INTERVAL 12 MONTH) AS first_day_full_12m_ago
-- If possible it should return NULL where there are no orders more than 12 months prior to the order being evaluated   

This produces the following output (with comments on the expected values).

| Row | order_id | cust_id | order_date | prev_12m_orders | Comment                                 |
|-----|----------|---------|------------|-----------------|-----------------------------------------|
| 1   | #1234    | c11111  | 2021-07-05 |               2 | Correct                                 |
| 2   | #1294    | c11111  | 2021-01-05 |               1 | Should be 0 as order in same month      |
| 3   | #1290    | c11111  | 2021-01-01 |               0 | Correct                                 |
| 4   | #1238    | c12345  | 2021-12-15 |               4 | Should be 2 as last orders out of range |
| 5   | #1201    | c12345  | 2021-10-10 |               3 | Should be 2 as last order out of range  |
| 6   | #1198    | c12345  | 2021-07-05 |               2 | Should be 1 as last order out of range  |
| 7   | #1134    | c12345  | 2020-10-15 |               1 | Should be 0 as last order out of range  |
| 8   | #1112    | c12345  | 2019-08-10 |               0 | Should be NULL as >12m prior orders     |
| 9   | #1284    | c22222  | 2020-01-15 |               0 | Should be NULL as >12m prior orders     |

Any suggestions would be much appreciated.

Consider the approach below:

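select *,
  count(order_id) over last_12m as prev_12m_orders
from input_data
window last_12m as (
  partition by cust_id
  -- YYYYMM as an integer: subtracting 100 lands on the same month one year earlier
  order by cast(format_date('%Y%m', order_date) as int64)
  -- frame: the 12 calendar months before the order's month, which is itself excluded
  range between 100 preceding and 1 preceding
)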

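Applied to the sample data in your question, the output matches your expected values, except that it shows 0 where you want NULL.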
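
A note on why `range between 100 preceding and 1 preceding` spans exactly 12 months: with the month encoded as the integer YYYYMM, subtracting 100 gives the same month one year earlier (202112 - 100 = 202012), and no real month falls on the unused values in between (202013 to 202099). For an order in December 2021 the frame therefore runs from December 2020 through November 2021. If that encoding feels too implicit, the same idea can be sketched with a contiguous month index. The index `year * 12 + month` is an assumption made here for illustration, not part of the query above, and the sample is limited to the `c12345` rows from the question:

```sql
with input_data as (
  select '#1238' as order_id, date '2021-12-15' as order_date, 'c12345' as cust_id
  union all select '#1201', date '2021-10-10', 'c12345'
  union all select '#1198', date '2021-07-05', 'c12345'
  union all select '#1134', date '2020-10-15', 'c12345'
  union all select '#1112', date '2019-08-10', 'c12345'
)
select *,
  count(order_id) over last_12m as prev_12m_orders
from input_data
window last_12m as (
  partition by cust_id
  -- assumed month index: consecutive calendar months differ by exactly 1
  order by extract(year from order_date) * 12 + extract(month from order_date)
  -- the 12 months before the order's month; the order's own month is excluded
  range between 12 preceding and 1 preceding
)
```

On these five rows this should give 2, 2, 1, 0 and 0 for #1238, #1201, #1198, #1134 and #1112, the same counts as the YYYYMM version.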

To meet your second requirement (NULL), use the following:

select * except(prev_12m_orders, prior_12m_orders),
  -- null only when the 12-month window is empty and this is the customer's only order through the end of its month
  if(prev_12m_orders = 0, if(prior_12m_orders = 1, null, 0), prev_12m_orders) as prev_12m_orders
from (
  select *,
    count(order_id) over last_12m as prev_12m_orders,
    count(order_id) over prior_12m as prior_12m_orders
  from input_data
  window last_12m as (
    partition by cust_id
    order by cast(format_date('%Y%m', order_date) as int64)
    range between 100 preceding and 1 preceding
  ), prior_12m as (
    -- all of the customer's orders up to and including the current order's month
    partition by cust_id
    order by cast(format_date('%Y%m', order_date) as int64)
    range between unbounded preceding and current row
  )
)

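With this version, the output matches the expected values in your table, including NULL for orders #1112 and #1284.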
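
As a cross-check that does not depend on the integer month encoding, the date bounds sketched in the question's comments can also be applied directly in a self-join. This is a different technique from the window queries above, shown only as a minimal sketch on the `c12345` rows. The aliases `o` and `p` are illustrative, and it returns 0 rather than NULL for a customer's first order:

```sql
with input_data as (
  select '#1238' as order_id, date '2021-12-15' as order_date, 'c12345' as cust_id
  union all select '#1201', date '2021-10-10', 'c12345'
  union all select '#1198', date '2021-07-05', 'c12345'
  union all select '#1134', date '2020-10-15', 'c12345'
  union all select '#1112', date '2019-08-10', 'c12345'
)
select
  o.order_id,
  o.cust_id,
  o.order_date,
  -- count(p.order_id) skips the null row produced when no earlier order matches
  count(p.order_id) as prev_12m_orders
from input_data as o
left join input_data as p
  on p.cust_id = o.cust_id
  -- on or after the first day of the month 12 months before the order's month
  and p.order_date >= date_sub(date_trunc(o.order_date, month), interval 12 month)
  -- strictly before the first day of the order's own month
  and p.order_date < date_trunc(o.order_date, month)
group by o.order_id, o.cust_id, o.order_date
```

The self-join compares every pair of a customer's orders, so on a large table the window version is likely the cheaper of the two. The join is mainly useful for validating the window counts on a sample.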