Running count of appearances of a customer ID in BigQuery

Question

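There are similar questions on here, but either I can't see how to adapt them to my situation (likely), or they aren't really that similar and only read close to what I want to do (BigQuery: How to calculate the running count of distinct visitors for each day and category?)

Anyway...

I have an orders table in BigQuery. It has a lot of column headers and I need to use all of them, but I'll only list a few here:

orderID, customerID, transactionDate, Revenue

(I need to pull all of the fields.)

I want to work out the instance of each customer ID in the table as a new column. So if I have placed 3 orders and my customer ID is 1234, my first row in the table would get 1 in the new column, the second would get 2, and the third would get 3.

For example, say my data looks like this: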

> OrderID     ||    CustomerID    ||    TransactionDate    ||    Revenue
> 1           ||    1             ||     01/01/15          ||     £20 
> 2           ||    2             ||     01/01/15          ||     £20 
> 3           ||    3             ||     01/01/15          ||     £20 
> 4           ||    1             ||     01/01/15          ||     £20 
> 5           ||    1             ||     01/01/15          ||     £20 
> 6           ||    2             ||     01/01/15          ||     £20 
> 7           ||    4             ||     01/01/15          ||     £20

I want to run a query over it that adds a new column giving the instance number of each CustomerID record, so that it looks like this:

> OrderID     ||    CustomerID    ||    TransactionDate    ||    Revenue ||Instance
> 1           ||    1             ||     01/01/15          ||     £20    ||1 
> 2           ||    2             ||     01/01/15          ||     £20    ||1
> 3           ||    3             ||     01/01/15          ||     £20    ||1 
> 4           ||    1             ||     01/01/15          ||     £20    ||2
> 5           ||    1             ||     01/01/15          ||     £20    ||3 
> 6           ||    2             ||     01/01/15          ||     £20    ||2 
> 7           ||    4             ||     01/01/15          ||     £20    ||1

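Every time a customerID that has already been seen turns up again, the instance goes up by 1.

I also need to run this against a table that keeps growing; it currently has 1.6 million rows.

Hope someone can help.

Cheers

John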

Answer 1

You should use a window function, such as row_number() OVER (PARTITION BY <your group-by field> ORDER BY transaction_date).
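As a rough sketch of what that could look like for the table in the question, written in BigQuery standard SQL: the table name `mydataset.orders` is a placeholder (the question doesn't name the table), and the `OrderID` tie-breaker is an assumption added so that orders sharing a date are numbered in a stable order.

```sql
-- `mydataset.orders` is a hypothetical table name; substitute your own.
SELECT
  *,  -- keep every existing column
  ROW_NUMBER() OVER (
    PARTITION BY CustomerID              -- restart the count for each customer
    ORDER BY TransactionDate, OrderID    -- OrderID breaks ties on the same date (assumption)
  ) AS Instance
FROM `mydataset.orders`
ORDER BY OrderID;
```

This only numbers correctly by date if TransactionDate is a real DATE or TIMESTAMP column; a string like '01/01/15' would sort as text.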

Answer 2

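Window functions are here to help you:

Window functions enable calculations on a specific partition, or "window", of a result set. Each window function expects an OVER clause that specifies the partition, in the following syntax: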

OVER (
      [PARTITION BY <expr>]
      [ORDER BY <expr>]
      [ROWS <expr> | RANGE <expr>]
     )

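PARTITION BY is always optional. ORDER BY is optional in some cases, but certain window functions, such as rank() or dense_rank(), require the clause.

JOIN EACH and GROUP EACH BY clauses can't be used on the output of window functions. To generate large query results when using window functions, you must use PARTITION BY.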
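To see why row_number() rather than rank() or dense_rank() is the one to use for a running count, here is a small sketch in BigQuery standard SQL. The three rows and their dates are made up for illustration (they are not the question's data), chosen so that two orders fall on the same day.

```sql
-- Illustrative rows only: one customer, two orders on the same date.
WITH orders AS (
  SELECT * FROM UNNEST([
    STRUCT(1 AS OrderID, 1 AS CustomerID, DATE '2015-01-01' AS TransactionDate),
    STRUCT(4, 1, DATE '2015-01-01'),
    STRUCT(5, 1, DATE '2015-01-02')
  ])
)
SELECT
  OrderID,
  CustomerID,
  TransactionDate,
  -- Always 1, 2, 3, ...; OrderID decides the order among same-day rows.
  ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY TransactionDate, OrderID) AS row_num,
  -- Tied rows share a rank and the following rank is skipped: 1, 1, 3.
  RANK()       OVER (PARTITION BY CustomerID ORDER BY TransactionDate) AS rnk,
  -- Tied rows share a rank with no gap afterwards: 1, 1, 2.
  DENSE_RANK() OVER (PARTITION BY CustomerID ORDER BY TransactionDate) AS dense_rnk
FROM orders
ORDER BY OrderID;
```

So when a customer can have several orders with the same sort key, only row_number() gives each row its own number.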

select *,
row_number() over (partition by CustomerID order by TransactionDate) as Instance
from  (select 1 as OrderID, 1 as CustomerID, '01/01/15' as TransactionDate,'£20' as Revenue), 
 (select 2 as OrderID, 2 as CustomerID, '01/01/15' as TransactionDate,'£20' as Revenue), 
 (select 3 as OrderID, 3 as CustomerID, '01/01/15' as TransactionDate,'£20' as Revenue), 
 (select 4 as OrderID, 1 as CustomerID, '01/01/15' as TransactionDate,'£20' as Revenue), 
 (select 5 as OrderID, 1 as CustomerID, '01/01/15' as TransactionDate,'£20' as Revenue), 
 (select 6 as OrderID, 2 as CustomerID, '01/01/15' as TransactionDate,'£20' as Revenue), 
 (select 7 as OrderID, 4 as CustomerID, '01/01/15' as TransactionDate,'£20' as Revenue)
 order by OrderID

Returns:

+-----+---------+------------+-----------------+---------+----------+
| Row | OrderID | CustomerID | TransactionDate | Revenue | Instance |
+-----+---------+------------+-----------------+---------+----------+
|   1 |       1 |          1 | 01/01/15        | £20     |        1 |
|   2 |       2 |          2 | 01/01/15        | £20     |        1 |
|   3 |       3 |          3 | 01/01/15        | £20     |        1 |
|   4 |       4 |          1 | 01/01/15        | £20     |        2 |
|   5 |       5 |          1 | 01/01/15        | £20     |        3 |
|   6 |       6 |          2 | 01/01/15        | £20     |        2 |
|   7 |       7 |          4 | 01/01/15        | £20     |        1 |
+-----+---------+------------+-----------------+---------+----------+
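One caveat: all seven sample rows share the same TransactionDate, so ordering by that column alone leaves the numbering within each customer up to the engine, and the result above is just one possible outcome. Below is a sketch of the same example in BigQuery standard SQL (UNION ALL instead of the legacy comma syntax) that should reproduce the Instance column deterministically. The DATE literals and the OrderID tie-breaker are assumptions, not part of the original query.

```sql
-- Same sample rows as above, written for standard SQL.
-- Assumption: dates are DATE values rather than 'dd/mm/yy' strings.
WITH orders AS (
  SELECT 1 AS OrderID, 1 AS CustomerID, DATE '2015-01-01' AS TransactionDate, '£20' AS Revenue UNION ALL
  SELECT 2, 2, DATE '2015-01-01', '£20' UNION ALL
  SELECT 3, 3, DATE '2015-01-01', '£20' UNION ALL
  SELECT 4, 1, DATE '2015-01-01', '£20' UNION ALL
  SELECT 5, 1, DATE '2015-01-01', '£20' UNION ALL
  SELECT 6, 2, DATE '2015-01-01', '£20' UNION ALL
  SELECT 7, 4, DATE '2015-01-01', '£20'
)
SELECT
  *,
  ROW_NUMBER() OVER (
    PARTITION BY CustomerID
    ORDER BY TransactionDate, OrderID  -- OrderID makes same-day ordering stable (assumption)
  ) AS Instance
FROM orders
ORDER BY OrderID;
```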
