BigQuery Count Instances between timeperiod Group By

I have an orders table uploaded to BigQuery with the following headers

ConsumerID, TransactionDate, Revenue, OrderID

ConsumerID and OrderID are integers; TransactionDate is a TIMESTAMP.

The data looks like this

ConsumerId   || TransactionDate          || Revenue   ||  OrderID
1            || 2014-10-27 00:00:00 UTC  || 55        ||  653745
1            || 2015-02-27 00:00:00 UTC  || 65        ||  767833
1            || 2015-12-27 00:00:00 UTC  || 456       ||  5676324
2            || 2014-10-27 00:00:00 UTC  || 56        ||  435261
2            || 2016-02-27 00:00:00 UTC  || 43        ||  5632436724

So my expected output is

ConsumerId   || Count Of Orders In First 12 Months
    1        || 2
    2        || 1

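I want to count the number of orders each customer placed within the first 12 months after their first order date.

In BigQuery I wrote the following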

SELECT
  ConsumerId,
  COUNT(OrderNumber BETWEEN MIN(TransactionDate)AND DATE_ADD(MIN(TransactionDate),11,"MONTH")) AS CountOfOrdersTwelve,
FROM
  [ordertable.orders]
GROUP BY
  1,
  2
ORDER BY
  ConsumerId ;

However, this fails with the following error

Error: (L3:157): Cannot group by an aggregate.

Does anyone know a way to do this in BigQuery?

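The error comes from GROUP BY 1, 2: position 2 refers to the second selected column, which is the COUNT(...) aggregate, and an aggregate cannot be used as a grouping key. Here is a quick option for you to consider (assuming the input below)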

      (SELECT 1 AS ConsumerID, '2014-01-01' AS TransactionDate, 1 AS OrderID),
      (SELECT 1 AS ConsumerID, '2014-05-01' AS TransactionDate, 2 AS OrderID),
      (SELECT 1 AS ConsumerID, '2015-01-01' AS TransactionDate, 3 AS OrderID),
      (SELECT 1 AS ConsumerID, '2015-03-01' AS TransactionDate, 4 AS OrderID),
      (SELECT 1 AS ConsumerID, '2015-04-01' AS TransactionDate, 5 AS OrderID),
      (SELECT 1 AS ConsumerID, '2015-05-01' AS TransactionDate, 6 AS OrderID),

      (SELECT 2 AS ConsumerID, '2015-01-01' AS TransactionDate, 1 AS OrderID),
      (SELECT 2 AS ConsumerID, '2015-01-01' AS TransactionDate, 2 AS OrderID),
      (SELECT 2 AS ConsumerID, '2015-01-01' AS TransactionDate, 3 AS OrderID),
      (SELECT 2 AS ConsumerID, '2015-03-01' AS TransactionDate, 4 AS OrderID),
      (SELECT 2 AS ConsumerID, '2015-04-01' AS TransactionDate, 5 AS OrderID),
      (SELECT 2 AS ConsumerID, '2016-05-01' AS TransactionDate, 6 AS OrderID),

      (SELECT 3 AS ConsumerID, '2015-04-01' AS TransactionDate, 1 AS OrderID),
      (SELECT 3 AS ConsumerID, '2015-05-01' AS TransactionDate, 2 AS OrderID)

Your data may differ in data types, so you will need to adjust accordingly

SELECT ConsumerID, MAX(CountOfOrders) AS CountOfOrdersTwelve
FROM (
  SELECT ConsumerID, CountOfOrders
  FROM (
    SELECT
      ConsumerID, TransactionDate,
      COUNT(1) OVER(PARTITION BY ConsumerID ORDER BY TransactionDate) AS CountOfOrders,
      FIRST_VALUE(TransactionDate) 
        OVER(PARTITION BY ConsumerID ORDER BY TransactionDate) AS firstTransactionDate
    FROM [ordertable.orders]
  ) HAVING DATEDIFF(TransactionDate, firstTransactionDate) <= 365
) GROUP BY ConsumerID ORDER BY ConsumerID
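
For readers on BigQuery standard SQL, the same idea (find each consumer's first transaction, then count the rows that fall within 365 days of it) might be sketched as follows. This is a rough translation under stated assumptions, not the query above: the inline `orders` CTE is a hypothetical copy of the question's sample rows so the sketch runs without a table, and "12 months" is approximated as 365 days, as in the legacy query.

```sql
#standardSQL
-- Hypothetical inline copy of the question's sample rows (assumption for illustration).
WITH orders AS (
  SELECT 1 AS ConsumerID, TIMESTAMP '2014-10-27 00:00:00 UTC' AS TransactionDate,
         55 AS Revenue, 653745 AS OrderID UNION ALL
  SELECT 1, TIMESTAMP '2015-02-27 00:00:00 UTC', 65, 767833 UNION ALL
  SELECT 1, TIMESTAMP '2015-12-27 00:00:00 UTC', 456, 5676324 UNION ALL
  SELECT 2, TIMESTAMP '2014-10-27 00:00:00 UTC', 56, 435261 UNION ALL
  SELECT 2, TIMESTAMP '2016-02-27 00:00:00 UTC', 43, 5632436724
),
with_first AS (
  -- Attach each consumer's earliest transaction to every one of their rows.
  SELECT
    ConsumerID,
    TransactionDate,
    MIN(TransactionDate) OVER (PARTITION BY ConsumerID) AS FirstTransactionDate
  FROM orders
)
SELECT
  ConsumerID,
  -- Count only the orders placed at most 365 days after the first one.
  COUNTIF(TIMESTAMP_DIFF(TransactionDate, FirstTransactionDate, DAY) <= 365)
    AS CountOfOrdersTwelve
FROM with_first
GROUP BY ConsumerID
ORDER BY ConsumerID;
```

On the question's five rows this should return 2 for ConsumerID 1 and 1 for ConsumerID 2, matching the expected output. If calendar months matter more than a fixed 365 days, one option is to compare DATE(TransactionDate) against DATE_ADD(DATE(FirstTransactionDate), INTERVAL 12 MONTH) instead.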

Compact version

Note: this version works with both STRING (as in the sample for the first solution above) and TIMESTAMP (as in the updated question) data types for TransactionDate

SELECT 
  ConsumerID, CountOfOrdersTwelve
FROM (
  SELECT 
    ConsumerID,
    TIMESTAMP_TO_SEC(TIMESTAMP(TransactionDate)) AS ts,
    COUNT(ts) OVER (PARTITION BY ConsumerID ORDER BY ts 
      RANGE BETWEEN CURRENT ROW AND 365*24*3600 FOLLOWING) AS CountOfOrdersTwelve,
    ROW_NUMBER() OVER(PARTITION BY ConsumerID ORDER BY ts) AS pos
  FROM [ordertable.orders]
)
WHERE pos = 1
ORDER BY ConsumerID
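
The compact version relies on a RANGE window frame over a numeric ordering key. A possible standard SQL counterpart is sketched below, assuming TransactionDate is a TIMESTAMP. The inline `orders` CTE is a hypothetical stand-in for the question's sample rows, and 31536000 is simply 365*24*3600 written out as a literal.

```sql
#standardSQL
-- Hypothetical inline sample (assumption for illustration); swap in the real table.
WITH orders AS (
  SELECT 1 AS ConsumerID, TIMESTAMP '2014-10-27 00:00:00 UTC' AS TransactionDate UNION ALL
  SELECT 1, TIMESTAMP '2015-02-27 00:00:00 UTC' UNION ALL
  SELECT 1, TIMESTAMP '2015-12-27 00:00:00 UTC' UNION ALL
  SELECT 2, TIMESTAMP '2014-10-27 00:00:00 UTC' UNION ALL
  SELECT 2, TIMESTAMP '2016-02-27 00:00:00 UTC'
)
SELECT ConsumerID, CountOfOrdersTwelve
FROM (
  SELECT
    ConsumerID,
    -- For each row, count the rows from this one up to 365 days (in seconds) later.
    COUNT(*) OVER (
      PARTITION BY ConsumerID
      ORDER BY UNIX_SECONDS(TransactionDate)
      RANGE BETWEEN CURRENT ROW AND 31536000 FOLLOWING
    ) AS CountOfOrdersTwelve,
    -- pos = 1 marks each consumer's first transaction.
    ROW_NUMBER() OVER (PARTITION BY ConsumerID ORDER BY TransactionDate) AS pos
  FROM orders
)
WHERE pos = 1
ORDER BY ConsumerID;
```

Keeping only the pos = 1 row means the count is anchored at the first transaction. Because a RANGE frame treats rows with equal ordering values as peers, several orders sharing that first timestamp should all be counted regardless of which of them gets pos = 1.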