BigQuery: Window ORDER BY is not allowed if DISTINCT is specified

I'm looking at porting some legacy BigQuery SQL that contains a windowed distinct count like this

count(distinct brand_id) over (partition by user_id order by order_placed_at range between 7 * 24 * 60 * 60 * 1000000 PRECEDING AND 1 PRECEDING) as last_7_day_buyer_brands

to standard SQL, but I get this error:

Window ORDER BY is not allowed if DISTINCT is specified

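For reference, I also tried the APPROX_COUNT_DISTINCT function, with no luck.

Is there a better way to get this working than writing subqueries and GROUP BYs?

Most of the other queries ported to standard SQL with only minor changes.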

From the documentation:

OVER clause requirements:

PARTITION BY: Optional.
ORDER BY: Optional. Disallowed if DISTINCT is present.
window_frame_clause: Optional. Disallowed if DISTINCT is present.

Note: the emphasis above is mine, not the documentation's.

As you can see, not only ORDER BY but also the window frame clause (RANGE BETWEEN) is disallowed when DISTINCT is used.
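To make the restriction concrete, here is a minimal sketch of the form that is accepted: DISTINCT with an OVER clause that has only PARTITION BY. The rows in the `yourTable` CTE are hypothetical sample data, not from the question, and the result is a per-user total rather than a rolling 7-day count.

```sql
#standardSQL
WITH yourTable AS (
  -- Hypothetical sample rows, for illustration only
  SELECT 1 AS user_id, 10 AS brand_id UNION ALL
  SELECT 1, 10 UNION ALL
  SELECT 1, 20 UNION ALL
  SELECT 2, 30
)
SELECT
  user_id,
  brand_id,
  -- No ORDER BY and no window frame, so DISTINCT is allowed here
  COUNT(DISTINCT brand_id) OVER (PARTITION BY user_id) AS total_buyer_brands
FROM yourTable
```

Adding ORDER BY or RANGE BETWEEN to this OVER clause is what triggers the error from the question.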

I think a subquery is the way to go.

In case you need a pointer, here is a simple example:

#standardSQL
SELECT
  user_id,
  order_placed_at,
  brand_id,
  (SELECT COUNT(DISTINCT brand) 
      FROM UNNEST(last_7_day_buyer_brands_with_dups) AS brand
  ) AS last_7_day_buyer_brands
FROM (
  SELECT 
    user_id,
    order_placed_at,
    brand_id,
    ARRAY_AGG(brand_id) OVER(
      PARTITION BY user_id ORDER BY order_placed_at 
      RANGE BETWEEN 7 * 24 * 60 * 60 * 1000000 PRECEDING AND 1 PRECEDING
    ) AS last_7_day_buyer_brands_with_dups
  FROM yourTable
)
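
The query above orders by order_placed_at with a numeric RANGE frame, so it assumes that column is an integer number of microseconds. If it is a TIMESTAMP instead, one possible adaptation is to order by UNIX_MICROS(order_placed_at). The sketch below does that against a few hypothetical sample rows (the `yourTable` CTE is invented for illustration) so it can be run as-is:

```sql
#standardSQL
WITH yourTable AS (
  -- Hypothetical sample rows; order_placed_at is assumed to be a TIMESTAMP
  SELECT 1 AS user_id, TIMESTAMP '2017-01-01 00:00:00' AS order_placed_at, 10 AS brand_id UNION ALL
  SELECT 1, TIMESTAMP '2017-01-02 00:00:00', 10 UNION ALL
  SELECT 1, TIMESTAMP '2017-01-03 00:00:00', 20 UNION ALL
  SELECT 1, TIMESTAMP '2017-01-20 00:00:00', 30
)
SELECT
  user_id,
  order_placed_at,
  brand_id,
  -- Deduplicate the collected brands for each row
  (SELECT COUNT(DISTINCT brand)
     FROM UNNEST(last_7_day_buyer_brands_with_dups) AS brand
  ) AS last_7_day_buyer_brands
FROM (
  SELECT
    user_id,
    order_placed_at,
    brand_id,
    -- Collect every brand_id (duplicates included) from the previous 7 days,
    -- excluding the current microsecond
    ARRAY_AGG(brand_id) OVER(
      PARTITION BY user_id
      ORDER BY UNIX_MICROS(order_placed_at)
      RANGE BETWEEN 7 * 24 * 60 * 60 * 1000000 PRECEDING AND 1 PRECEDING
    ) AS last_7_day_buyer_brands_with_dups
  FROM yourTable
)
ORDER BY user_id, order_placed_at
```

The inner query does the windowing without DISTINCT, and the correlated subquery over UNNEST does the deduplication, which is how the two restrictions are sidestepped.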