BigQuery Error – UNIQUE_HEAP requires an int32 argument

Using legacy SQL, I am trying to use COUNT(DISTINCT field, n) in Google BigQuery, but I get the following error:

UNIQUE_HEAP requires an int32 argument which is greater than 0 (error code: invalidQuery)

This is the query I used:

SELECT
    hits.page.pagePath AS Page,
    COUNT(DISTINCT CONCAT(fullVisitorId, INTEGER(visitId)), 1e6) AS UniquePageviews,
    COUNT(DISTINCT fullVisitorId, 1e6) as Users
FROM
    [xxxxxxxx.ga_sessions_20170101]
GROUP BY
    Page
ORDER BY
    UniquePageviews DESC
LIMIT
    20

BigQuery doesn't even show the line number of the error, so I'm not sure which line is causing it.

What could be causing this error?

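Don't use 1e6 in COUNT(DISTINCT). 1e6 is a floating-point literal, and the second argument, n, must be an integer greater than 0. Use an actual INTEGER value such as 1000000 instead (the default is 1000), or use EXACT_COUNT_DISTINCT().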
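As a rough sketch of that fix, the query from the question could be rewritten with integer literals in place of 1e6. This is untested against the asker's dataset. The switch from INTEGER(visitId) to STRING(visitId) is my own assumption (CONCAT works on strings) and is not part of the original query.

```sql
-- Legacy SQL. The only required change is 1e6 -> 1000000 (an integer literal).
SELECT
    hits.page.pagePath AS Page,
    -- Assumption: visitId is cast with STRING() so CONCAT receives strings.
    COUNT(DISTINCT CONCAT(fullVisitorId, STRING(visitId)), 1000000) AS UniquePageviews,
    COUNT(DISTINCT fullVisitorId, 1000000) AS Users
FROM
    [xxxxxxxx.ga_sessions_20170101]
GROUP BY
    Page
ORDER BY
    UniquePageviews DESC
LIMIT
    20
```

If exact counts are needed regardless of cardinality, either COUNT(DISTINCT ..., 1000000) expression can be swapped for EXACT_COUNT_DISTINCT(...), for example EXACT_COUNT_DISTINCT(fullVisitorId) AS Users.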

From the legacy SQL documentation for COUNT(DISTINCT) and EXACT_COUNT_DISTINCT():

If you require greater accuracy from COUNT(DISTINCT), you can specify a second parameter, n, which gives the threshold below which exact results are guaranteed. By default, n is 1000, but if you give a larger n, you will get exact results for COUNT(DISTINCT) up to that value of n. However, giving larger values of n will reduce scalability of this operator and may substantially increase query execution time or cause the query to fail.

To compute the exact number of distinct values, use EXACT_COUNT_DISTINCT. Or, for a more scalable approach, consider using GROUP EACH BY on the relevant field(s) and then applying COUNT(*). The GROUP EACH BY approach is more scalable but might incur a slight up-front performance penalty.
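The GROUP EACH BY alternative mentioned above might look roughly like the following for the Users column. This is a sketch only, assuming the same table and fields as the question: the inner query reduces the data to one row per (Page, fullVisitorId) pair, and the outer query counts those rows per page.

```sql
-- Legacy SQL sketch: exact distinct users per page without COUNT(DISTINCT).
SELECT
    Page,
    COUNT(*) AS Users  -- one inner row per distinct visitor on this page
FROM (
    SELECT
        hits.page.pagePath AS Page,
        fullVisitorId
    FROM
        [xxxxxxxx.ga_sessions_20170101]
    -- Deduplicate (Page, fullVisitorId) pairs.
    GROUP EACH BY
        Page, fullVisitorId
)
GROUP BY
    Page
ORDER BY
    Users DESC
LIMIT
    20
```

Computing UniquePageviews the same way would need a second subquery grouped on page plus the visitor and visit identifiers, so this form is mainly worth it when the distinct counts are too large for COUNT(DISTINCT field, n) to handle comfortably.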