Get a rolling order count into session data

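I have the following table:

One customer made two purchases within a single session.

My goal is to assign an order counter to every row of the table.

To get there, I use the LAG function to fetch the previous order_id and the previous order_timestamp: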

SELECT
  LAG(event_timestamp) OVER (PARTITION BY session_id ORDER BY ecom_data.order_id) AS prev_order_timestamp,
  LAG(ecom_data.order_id) OVER (PARTITION BY session_id ORDER BY event_timestamp) AS prev_order_number
FROM table

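The output I want looks like this:

The problem: I don't get the timestamp of the previous order. Instead, I get the event_timestamp of the previous event.

My second challenge is that I don't know how to assign order_count. The output I want looks like this:

Ideally, the order count should be rolling, because in the real data set I don't know how many orders each session has in total. There can be anywhere from 0 to an unlimited number of orders.

Can you help?

Thanks!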


### create sample table (helps to introduce these in your questions)
WITH
  base AS (
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:17:41") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'ts' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:17:42") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:27:14") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'atc' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:27:15") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'p' AS event_name,
    123 AS order_id,
    DATETIME("2022-05-12 10:30:47") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:30:50") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:01") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'atc' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:20") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'ts' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:22") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'rv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:32") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:35") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:32:49") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'p' AS event_name,
    456 AS order_id,
    DATETIME("2022-05-12 10:33:35") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:33:48") AS event_timestamp
  UNION ALL
  SELECT
    'A' AS client_id,
    1 AS session_id,
    'tv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:33:50") AS event_timestamp
  UNION ALL
  SELECT
    'B' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 10:31:50") AS event_timestamp
  UNION ALL
  SELECT
    'B' AS client_id,
    1 AS session_id,
    'p' AS event_name,
    123 AS order_id,
    DATETIME("2022-05-12 10:33:50") AS event_timestamp
  UNION ALL
  SELECT
    'C' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 11:13:50") AS event_timestamp
  UNION ALL
  SELECT
    'C' AS client_id,
    1 AS session_id,
    'pv' AS event_name,
    NULL AS order_id,
    DATETIME("2022-05-12 11:33:50") AS event_timestamp),

  prev_order1 AS (
  SELECT
    *,
    LAG(order_id) OVER (PARTITION BY client_id ORDER BY event_timestamp) AS prev_order_number1
  FROM
    base),

  ### carry the latest previous order number forward to match your requested output
  ### (the running MAX assumes order_id values increase over time)
  prev_order2 AS (
  SELECT
    *,
    MAX(prev_order_number1) OVER(partition by client_id ORDER BY event_timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS prev_order_number2
  FROM
    prev_order1
  ORDER BY
    event_timestamp )

### inserting order_counter logic
SELECT
  *,
  DENSE_RANK() OVER(partition by client_id ORDER BY prev_order_number2) - 1 AS order_counter
FROM
  prev_order2
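
The DENSE_RANK step above orders by the order number itself, so it only counts correctly when order IDs increase over time. If that is not guaranteed in your data, a running COUNT is one alternative. This is only a sketch under assumptions: the five-row `events` CTE is hypothetical sample data, the column names are borrowed from the `base` CTE, and I partition by both client_id and session_id.

```sql
-- Hypothetical five-row sample; column names follow the `base` CTE above.
WITH events AS (
  SELECT 'A' AS client_id, 1 AS session_id, NULL AS order_id,
         DATETIME("2022-05-12 10:27:15") AS event_timestamp
  UNION ALL SELECT 'A', 1, 123,  DATETIME("2022-05-12 10:30:47")
  UNION ALL SELECT 'A', 1, NULL, DATETIME("2022-05-12 10:30:50")
  UNION ALL SELECT 'A', 1, 456,  DATETIME("2022-05-12 10:33:35")
  UNION ALL SELECT 'A', 1, NULL, DATETIME("2022-05-12 10:33:48")
)
SELECT
  *,
  -- COUNT(order_id) skips NULLs, so this is the number of orders
  -- placed strictly before the current row.
  COUNT(order_id) OVER (
    PARTITION BY client_id, session_id
    ORDER BY event_timestamp
    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
  ) AS order_counter
FROM events
ORDER BY event_timestamp
```

On this sample the counter reads 0, 0, 1, 1, 2, so a purchase row still carries the count from before the purchase, as in the query above. If you want the purchase row itself to count, end the frame at CURRENT ROW instead of 1 PRECEDING.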

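To handle edge cases, you may want to partition by another dimension, such as client_id, rather than over the whole table (as you have now). I included client_id = B as an example.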
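
For the previous order's timestamp from your question, here is a hedged sketch rather than a tested solution for your real table. Plain LAG returns the previous row, whatever event it is, so the idea is to blank out the timestamp on non-order rows and take the last non-NULL value among the preceding rows. The `events` CTE is hypothetical sample data in which clients A and B both contain order_id 123, and partitioning by client_id plus session_id is my assumption about what identifies a session.

```sql
-- Hypothetical sample: clients A and B both contain order_id 123.
WITH events AS (
  SELECT 'A' AS client_id, 1 AS session_id, NULL AS order_id,
         DATETIME("2022-05-12 10:27:15") AS event_timestamp
  UNION ALL SELECT 'A', 1, 123,  DATETIME("2022-05-12 10:30:47")
  UNION ALL SELECT 'A', 1, NULL, DATETIME("2022-05-12 10:30:50")
  UNION ALL SELECT 'A', 1, 456,  DATETIME("2022-05-12 10:33:35")
  UNION ALL SELECT 'A', 1, NULL, DATETIME("2022-05-12 10:33:48")
  UNION ALL SELECT 'B', 1, NULL, DATETIME("2022-05-12 10:31:50")
  UNION ALL SELECT 'B', 1, 123,  DATETIME("2022-05-12 10:33:50")
)
SELECT
  *,
  -- Most recent non-NULL order_id among the rows before this one.
  LAST_VALUE(order_id IGNORE NULLS) OVER (
    PARTITION BY client_id, session_id
    ORDER BY event_timestamp
    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
  ) AS prev_order_number,
  -- Keep the timestamp only on order rows, then take the latest one
  -- among the rows before this one.
  LAST_VALUE(IF(order_id IS NOT NULL, event_timestamp, NULL) IGNORE NULLS) OVER (
    PARTITION BY client_id, session_id
    ORDER BY event_timestamp
    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
  ) AS prev_order_timestamp
FROM events
ORDER BY client_id, event_timestamp
```

Because of the partition, client B's rows never see client A's order 123, and both of B's rows come back with NULL in the two new columns since B has no earlier order.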