SQL BigQuery - 我如何为每个订单计算该客户在前一个完整的 12 个月期间的订单数量
SQL BigQuery - how can I calculate for each order the the count of that customers' orders in the preceding full 12 month period
对于每条订单记录,我想计算之前完整 12 个月期间(不包括订单月份)来自该客户的先前订单数。
我可以在没有日期限制的情况下进行计数(代码如下)。
但我不知道如何将计数限制在 'rolling' 日期范围内。
对于我遗漏的任何建议,我将不胜感激!
With input_data AS (
SELECT '#1238' as order_id, DATE('2021-12-15') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1201' as order_id, DATE('2021-10-10') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1198' as order_id, DATE('2021-07-05') as order_date, 'c12345' as cust_id, 20 as order_value
UNION ALL SELECT '#1134' as order_id, DATE('2020-10-15') as order_date, 'c12345' as cust_id, 10 as order_value
UNION ALL SELECT '#1112' as order_id, DATE('2019-08-10') as order_date, 'c12345' as cust_id, 5 as order_value
UNION ALL SELECT '#1234' as order_id, DATE('2021-07-05') as order_date, 'c11111' as cust_id, 118 as order_value
UNION ALL SELECT '#1294' as order_id, DATE('2021-01-05') as order_date, 'c11111' as cust_id, 68 as order_value
UNION ALL SELECT '#1290' as order_id, DATE('2021-01-01') as order_date, 'c11111' as cust_id, 82 as order_value
UNION ALL SELECT '#1284' as order_id, DATE('2020-01-15') as order_date, 'c22222' as cust_id, 98 as order_value)
SELECT
order_id
, cust_id
, order_date
, prev_12m_orders
FROM (
SELECT order_id, cust_id, order_date,
COUNT(order_id) OVER(PARTITION BY cust_id ORDER BY order_date DESC ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS prev_12m_orders
FROM input_data
)
-- Limit prev_12m_orders to the range of the last complete 12 month period, something like
-- order_date < DATE_SUB(DATE_TRUNC(order_date, MONTH), INTERVAL 1 DAY) AS last_day_prev_mth
-- order_date < DATE_SUB(DATE_TRUNC(order_date, MONTH), INTERVAL 12 MONTH) AS first_day_full_12m_ago
-- If possible it should return NULL where there are no orders more than 12 months prior to the order being evaluated
这会生成以下输出(带有对预期值的注释)。
| Row | order_id | cust_id | order_date | prev_12m_orders | Comment |
|-----|----------|---------|------------|-----------------|----------------------------------------|
| 1 | #1234 | c11111 | 2021-07-05 | 2 | Correct |
| 2 | #1294 | c11111 | 2021-01-05 | 1 | Should be 0 as order in same month |
| 3 | #1290 | c11111 | 2021-01-01 | 0 | Correct |
| 4 | #1238 | c12345 | 2021-12-15 | 4 | Should be 2 as last orders out of range |
| 5 | #1201 | c12345 | 2021-10-10 | 3 | Should be 2 as last orders out of range |
| 6 | #1198 | c12345 | 2021-07-05 | 2 | Should be 1 as last order out of range |
| 7 | #1134 | c12345 | 2020-10-15 | 1 | Should be 0 as last order out of range |
| 8 | #1112 | c12345 | 2019-08-10 | 0 | Should be NULL as >12m prior orders |
| 9 | #1284 | c22222 | 2020-01-15 | 0 | Should be NULL as >12m prior orders |
非常感谢任何建议...
考虑以下方法
select *,
count(order_id) over last_12m as prev_12m_orders
from input_data
window last_12m as (
partition by cust_id
order by cast(format_date('%Y%m', order_date ) as int64)
range between 100 preceding and 1 preceding
)
如果应用于您问题中的示例数据 - 输出为
用 null
满足您的第二个要求 - 使用下面的
select * except(prev_12m_orders, prior_12m_orders),
if(prev_12m_orders = 0, if(prior_12m_orders = 1, null, 0), prev_12m_orders) as prev_12m_orders
from (
select *,
count(order_id) over last_12m as prev_12m_orders,
count(order_id) over prior_12m as prior_12m_orders
from input_data
window last_12m as (
partition by cust_id
order by cast(format_date('%Y%m', order_date ) as int64)
range between 100 preceding and 1 preceding
), prior_12m as (
partition by cust_id
order by cast(format_date('%Y%m', order_date ) as int64)
range between unbounded preceding and current row
)
)
有输出
对于每条订单记录,我想计算之前完整 12 个月期间(不包括订单月份)来自该客户的先前订单数。
我可以在没有日期限制的情况下进行计数(代码如下)。
但我不知道如何将计数限制在 'rolling' 日期范围内。
对于我遗漏的任何建议,我将不胜感激!
With input_data AS (
SELECT '#1238' as order_id, DATE('2021-12-15') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1201' as order_id, DATE('2021-10-10') as order_date, 'c12345' as cust_id, 18 as order_value
UNION ALL SELECT '#1198' as order_id, DATE('2021-07-05') as order_date, 'c12345' as cust_id, 20 as order_value
UNION ALL SELECT '#1134' as order_id, DATE('2020-10-15') as order_date, 'c12345' as cust_id, 10 as order_value
UNION ALL SELECT '#1112' as order_id, DATE('2019-08-10') as order_date, 'c12345' as cust_id, 5 as order_value
UNION ALL SELECT '#1234' as order_id, DATE('2021-07-05') as order_date, 'c11111' as cust_id, 118 as order_value
UNION ALL SELECT '#1294' as order_id, DATE('2021-01-05') as order_date, 'c11111' as cust_id, 68 as order_value
UNION ALL SELECT '#1290' as order_id, DATE('2021-01-01') as order_date, 'c11111' as cust_id, 82 as order_value
UNION ALL SELECT '#1284' as order_id, DATE('2020-01-15') as order_date, 'c22222' as cust_id, 98 as order_value)
SELECT
order_id
, cust_id
, order_date
, prev_12m_orders
FROM (
SELECT order_id, cust_id, order_date,
COUNT(order_id) OVER(PARTITION BY cust_id ORDER BY order_date DESC ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS prev_12m_orders
FROM input_data
)
-- Limit prev_12m_orders to the range of the last complete 12 month period, something like
-- order_date < DATE_SUB(DATE_TRUNC(order_date, MONTH), INTERVAL 1 DAY) AS last_day_prev_mth
-- order_date < DATE_SUB(DATE_TRUNC(order_date, MONTH), INTERVAL 12 MONTH) AS first_day_full_12m_ago
-- If possible it should return NULL where there are no orders more than 12 months prior to the order being evaluated
这会生成以下输出(带有对预期值的注释)。
| Row | order_id | cust_id | order_date | prev_12m_orders | Comment |
|-----|----------|---------|------------|-----------------|----------------------------------------|
| 1 | #1234 | c11111 | 2021-07-05 | 2 | Correct |
| 2 | #1294 | c11111 | 2021-01-05 | 1 | Should be 0 as order in same month |
| 3 | #1290 | c11111 | 2021-01-01 | 0 | Correct |
| 4 | #1238 | c12345 | 2021-12-15 | 4 | Should be 2 as last orders out of range |
| 5 | #1201 | c12345 | 2021-10-10 | 3 | Should be 2 as last orders out of range |
| 6 | #1198 | c12345 | 2021-07-05 | 2 | Should be 1 as last order out of range |
| 7 | #1134 | c12345 | 2020-10-15 | 1 | Should be 0 as last order out of range |
| 8 | #1112 | c12345 | 2019-08-10 | 0 | Should be NULL as >12m prior orders |
| 9 | #1284 | c22222 | 2020-01-15 | 0 | Should be NULL as >12m prior orders |
非常感谢任何建议...
考虑以下方法
select *,
count(order_id) over last_12m as prev_12m_orders
from input_data
window last_12m as (
partition by cust_id
order by cast(format_date('%Y%m', order_date ) as int64)
range between 100 preceding and 1 preceding
)
如果应用于您问题中的示例数据 - 输出为
用 null
满足您的第二个要求 - 使用下面的
select * except(prev_12m_orders, prior_12m_orders),
if(prev_12m_orders = 0, if(prior_12m_orders = 1, null, 0), prev_12m_orders) as prev_12m_orders
from (
select *,
count(order_id) over last_12m as prev_12m_orders,
count(order_id) over prior_12m as prior_12m_orders
from input_data
window last_12m as (
partition by cust_id
order by cast(format_date('%Y%m', order_date ) as int64)
range between 100 preceding and 1 preceding
), prior_12m as (
partition by cust_id
order by cast(format_date('%Y%m', order_date ) as int64)
range between unbounded preceding and current row
)
)
有输出