BigQuery - Grouping by date with a WHERE clause

My goal is to search for certain products, count the total for each product, and then group those totals by day.

The schema looks like this:

line_items.sku created_at
product1 2020-04-02T13:22:44
product2 2020-04-02T05:01:22
product2 2020-04-03T14:21:10

My query is as follows:

SELECT
  EXTRACT(DAY
  FROM
    CAST(`order`.created_at AS DATETIME)) AS day_extracted,
  EXTRACT(MONTH
  FROM
    CAST(`order`.created_at AS DATETIME)) AS month_extracted,
  `order`.line_items.sku AS sku
FROM
  `mydatabase`
WHERE
  `order`.line_items.sku = "product1"
  OR `order`.line_items.sku = "product2"

The result looks like this:

row day_extracted month_extracted sku
1 5 2 product1
2 4 1 product2
2 4 1 product1

This works fine, but I run into a problem when I need to group by product and count the total for each product per day.

What am I doing wrong? If I add

  GROUP BY month_extracted, day_extracted

to the query, I get this error:

SELECT list expression references `order`.line_items which is neither grouped nor aggregated at [8:3]

Line 8 is:

`order`.line_items.sku AS sku

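In general, the clauses of a SQL query are evaluated in this order: FROM, WHERE, GROUP BY, HAVING, SELECT, ORDER BY.

This means that, by the time GROUP BY runs, the SELECT aliases month_extracted and day_extracted have not been computed yet. To get around this, either put the whole EXTRACT(..) expression in the GROUP BY, or use a subquery. (BigQuery does let GROUP BY refer to SELECT aliases, which is why the error points at the sku column rather than at the aliases.) There is one more rule: anything in the SELECT list that is not part of the GROUP BY must be wrapped in an aggregate function. That rule is what triggers the error in your case.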

SELECT
  day_extracted,
  month_extracted,
  -- ANY_VALUE satisfies the aggregate rule; use whichever aggregate function fits your logic
  ANY_VALUE(sku) AS sku
FROM (
  SELECT
    EXTRACT(DAY
    FROM
      CAST(`order`.created_at AS DATETIME)) AS day_extracted,
    EXTRACT(MONTH
    FROM
      CAST(`order`.created_at AS DATETIME)) AS month_extracted,
    `order`.line_items.sku AS sku
  FROM
    `mydatabase`
  WHERE
    `order`.line_items.sku = "product1"
    OR `order`.line_items.sku = "product2"
) AS _table
GROUP BY day_extracted, month_extracted
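
For comparison, the other option mentioned above (grouping without a subquery) might look roughly like the sketch below. It assumes line_items is a single STRUCT rather than a repeated field, as the original query implies, and the alias total is a made-up name. Here sku is added to the GROUP BY, so it no longer needs an aggregate, and COUNT(*) gives the per-product count.

```sql
-- Sketch only: `mydatabase` and the column paths are taken from the question;
-- the alias `total` is hypothetical.
SELECT
  EXTRACT(DAY FROM CAST(`order`.created_at AS DATETIME)) AS day_extracted,
  EXTRACT(MONTH FROM CAST(`order`.created_at AS DATETIME)) AS month_extracted,
  `order`.line_items.sku AS sku,
  COUNT(*) AS total  -- rows per (month, day, sku) group
FROM
  `mydatabase`
WHERE
  `order`.line_items.sku IN ("product1", "product2")
GROUP BY
  month_extracted, day_extracted, sku
ORDER BY
  month_extracted, day_extracted
```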

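Mr. Batra sent me down the subquery rabbit hole, which led me to the solution. The order in which a query's clauses are executed also makes more sense to me now.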

SELECT day_extracted, month_extracted, Sku, count(*)
FROM (
  SELECT
    EXTRACT(DAY
    FROM
      CAST(`order`.created_at AS DATETIME)) AS day_extracted,
    EXTRACT(MONTH
    FROM
      CAST(`order`.created_at AS DATETIME)) AS month_extracted,
    `order`.line_items.sku AS Sku
  FROM
    `mydatabase`
  WHERE
    `order`.line_items.sku = "product1"
    OR `order`.line_items.sku = "product2"
) AS temp
GROUP BY temp.Sku, day_extracted, month_extracted
ORDER BY day_extracted

This gives me data in this format:

day_extracted month_extracted Sku col1
1 2 product1 41
1 2 product2 55
2 2 product1 91
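
One caveat with grouping on day and month separately is that rows from the same day and month of different years end up in the same group. If that matters, a variant that groups on the full calendar date could look like the sketch below. The aliases order_date and total are hypothetical names, and it assumes created_at casts cleanly to DATETIME as in the queries above.

```sql
-- Sketch only: groups on the full date instead of separate day/month parts.
-- `order_date` and `total` are hypothetical aliases.
SELECT
  DATE(CAST(`order`.created_at AS DATETIME)) AS order_date,
  `order`.line_items.sku AS sku,
  COUNT(*) AS total  -- rows per (date, sku) group
FROM
  `mydatabase`
WHERE
  `order`.line_items.sku IN ("product1", "product2")
GROUP BY
  order_date, sku
ORDER BY
  order_date, sku
```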