Get a rolling order count into session data
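I have the following table.
A customer purchased twice in a single session.
My goal is to assign an order counter to every row of the table.
To get there, I use the LAG function to fetch the last order_id and the last order_timestamp: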
SELECT
LAG(event_timestamp) OVER (PARTITION BY session_id ORDER BY ecom_data.order_id) AS prev_order_timestamp,
LAG(ecom_data.order_id) OVER (PARTITION BY session_id ORDER BY event_timestamp) AS prev_order_number
FROM table
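My desired output looks like this:
The problem: I do not get the timestamp of the previous order. Instead, I get the event_timestamp of the previous event.
My second challenge is that I do not know how to assign the order_count. My desired output looks like this:
Ideally, this order count should be rolling, because in the real dataset I do not know how many orders each session has in total. There can be anywhere from 0 to an unlimited number of orders.
Can you help?
Thanks!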
### create sample table (helps to introduce these in your questions)
WITH
base AS (
SELECT
'A' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:17:41") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'ts' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:17:42") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:27:14") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'atc' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:27:15") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'p' AS event_name,
123 AS order_id,
DATETIME("2022-05-12 10:30:47") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:30:50") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:31:01") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'atc' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:31:20") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'ts' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:31:22") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'rv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:31:32") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:31:35") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:32:49") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'p' AS event_name,
456 AS order_id,
DATETIME("2022-05-12 10:33:35") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:33:48") AS event_timestamp
UNION ALL
SELECT
'A' AS client_id,
1 AS session_id,
'tv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:33:50") AS event_timestamp
UNION ALL
SELECT
'B' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 10:31:50") AS event_timestamp
UNION ALL
SELECT
'B' AS client_id,
1 AS session_id,
'p' AS event_name,
123 AS order_id,
DATETIME("2022-05-12 10:33:50") AS event_timestamp
UNION ALL
SELECT
'C' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 11:13:50") AS event_timestamp
UNION ALL
SELECT
'C' AS client_id,
1 AS session_id,
'pv' AS event_name,
NULL AS order_id,
DATETIME("2022-05-12 11:33:50") AS event_timestamp),
prev_order1 AS (
SELECT
*,
LAG(order_id) OVER (PARTITION BY client_id ORDER BY event_timestamp) AS prev_order_number1
FROM
base),
### filling in order number using your requested output (the running MAX only works if order_id increases over time)
prev_order2 AS (
SELECT
*,
MAX(prev_order_number1) OVER(partition by client_id ORDER BY event_timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS prev_order_number2
FROM
prev_order1
ORDER BY
event_timestamp )
### inserting order_counter logic (NULLs sort first, so rows before the first order get 0)
SELECT
*,
DENSE_RANK() OVER(partition by client_id ORDER BY prev_order_number2) - 1 AS order_counter
FROM
prev_order2
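To cover edge cases, you may want to partition by other dimensions, such as client_id, rather than by the whole table (as you have now). I included client_id = B as an example.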
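As a possible alternative to the LAG / MAX / DENSE_RANK chain, here is a minimal sketch that computes the rolling counter and the previous order's timestamp in one pass. It assumes BigQuery Standard SQL, that a row is a purchase exactly when order_id is not NULL, and that events should be grouped by both client_id and session_id. The events CTE and the window name earlier_events are made up for illustration. Because it counts non-NULL order_id values instead of ranking them, it should not depend on order_id increasing over time.

```sql
-- Hypothetical sample: one client, one session, two purchases.
WITH events AS (
  SELECT 'A' AS client_id, 1 AS session_id, 'pv' AS event_name,
         CAST(NULL AS INT64) AS order_id,
         DATETIME '2022-05-12 10:17:41' AS event_timestamp
  UNION ALL SELECT 'A', 1, 'p',  123,  DATETIME '2022-05-12 10:30:47'
  UNION ALL SELECT 'A', 1, 'pv', NULL, DATETIME '2022-05-12 10:30:50'
  UNION ALL SELECT 'A', 1, 'p',  456,  DATETIME '2022-05-12 10:33:35'
  UNION ALL SELECT 'A', 1, 'tv', NULL, DATETIME '2022-05-12 10:33:50'
)
SELECT
  *,
  -- COUNT skips NULLs, so this is the number of orders placed before this row.
  COUNT(order_id) OVER earlier_events AS order_counter,
  -- Most recent non-NULL order_id among the earlier rows.
  LAST_VALUE(order_id IGNORE NULLS) OVER earlier_events AS prev_order_number,
  -- Timestamp of that earlier order (only purchase rows contribute a value).
  LAST_VALUE(IF(order_id IS NOT NULL, event_timestamp, NULL) IGNORE NULLS)
    OVER earlier_events AS prev_order_timestamp
FROM events
WINDOW earlier_events AS (
  PARTITION BY client_id, session_id
  ORDER BY event_timestamp
  -- Frame ends one row before the current row, so a purchase row
  -- does not count itself.
  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
)
ORDER BY client_id, session_id, event_timestamp
```

On this sample, order_counter should come out as 0, 0, 1, 1, 2, which matches how the DENSE_RANK version treats a purchase row (it is counted only from the following event onward). If the purchase row should already include itself, ending the frame at CURRENT ROW instead of 1 PRECEDING would be the change to try.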