BigQuery: Scalar subquery produced more than one element - Custom Dimensions

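I am trying to add a custom dimension to one of the queries in my union, but I am running into the error that a scalar subquery produced more than one element. I am migrating to standard SQL, so please answer using standard SQL. I think the problem is in this piece of code:
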
(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65) AS viewID,

Full example of the query:

#standardSQL
SELECT
  date,
  channelGrouping,
  viewID,
  SUM(Revenue) Revenue,
  SUM(Shipping) Shipping,
  SUM(bounces) bounces,
  SUM(transactions) transactions,
  COUNT(date) sessions
FROM (
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    (
    SELECT
      d.value
    FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS d
    WHERE
      d.index = 65) AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703' )
GROUP BY
  date,
  channelGrouping,
  viewID

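The problem is that some or all of the hits have a custom dimension with an index of 65, so within a single session the subquery can match more than one row, which a scalar subquery does not allow. There are a few different ways to fix this. You can use an ARRAY subquery to get all of the values with that index: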

ARRAY(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65) AS viewIDs,

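This gives you the view IDs for all of the hits, but viewID then also has to be an array in the other queries of the union, starting with the first, because the columns of a UNION ALL must have compatible types. Another option is to take the view ID from the first hit: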

(
  SELECT
    d.value
  FROM
    UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS d
  WHERE
    d.index = 65) AS viewID

Or, if you don't care which view ID you get, you can use LIMIT to pick an arbitrary one:

(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65
  LIMIT 1) AS viewID,
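
If an arbitrary pick is too loose, a small variation on the LIMIT approach makes the choice repeatable by ordering on the hit number first. The following is only a sketch on made-up data: the `sessions` CTE and the values 'view-a' and 'view-b' are hypothetical, and it assumes `hitNumber` is populated as in the ga_sessions export.

```sql
#standardSQL
-- Hypothetical mock session: two hits, each carrying custom dimension 65.
WITH sessions AS (
  SELECT [
    STRUCT(1 AS hitNumber, [STRUCT(65 AS index, 'view-a' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber, [STRUCT(65 AS index, 'view-b' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  (
    SELECT
      d.value
    FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS d
    WHERE
      d.index = 65
    ORDER BY
      hits.hitNumber  -- the earliest hit that carries the dimension wins
    LIMIT 1) AS viewID
FROM sessions
```

With this mock data the subquery picks the value from hit 1. Unlike the first-hit variant, it still returns a value when the dimension is only set on a later hit.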

You can mock some data in BigQuery to get a better understanding of what is happening here.

For example, this data mimics the hits schema in ga_sessions:

WITH data AS (
  SELECT ARRAY<STRUCT<hitNumber INT64, customDimensions ARRAY<STRUCT<index INT64, value STRING>> >> [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value), STRUCT(2 AS index, 'val2' AS value), STRUCT(3 AS index, 'val3' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)
  ] hits
)

SELECT * FROM data

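Now, if you run a scalar subquery against this mock data for index = 1, you get the same error, because index 1 appears in two different places, once in each hit.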
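
A minimal sketch of a query that reproduces the error. The mock data is repeated in a shortened, untyped form so that the snippet runs on its own; the alias `dim_value` is just an illustrative name, and index 1 stands in for index 65 from the question.

```sql
#standardSQL
WITH data AS (
  SELECT [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value), STRUCT(2 AS index, 'val2' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  -- Both hits contain index 1, so the subquery yields two rows and the query
  -- fails with "Scalar subquery produced more than one element".
  (
    SELECT
      custd.value
    FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS custd
    WHERE
      custd.index = 1) AS dim_value
FROM data
```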

To make it work, you have to return the value as an ARRAY, like this:

SELECT
  ARRAY(SELECT custd.value FROM UNNEST(hits) hits, UNNEST(hits.customDimensions) custd WHERE custd.index = 1)
FROM data

The result is a single row whose column is an array holding both matching values, 'val1' and 'val1'.

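So in your query you either have to adapt to getting this value back as an ARRAY or, if the value is the same for every entry with index = 65, you can do something like: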

SELECT
  (SELECT custd.value FROM UNNEST(hits) hits, UNNEST(hits.customDimensions) custd WHERE custd.index = 1 LIMIT 1)
FROM data

This way the scalar subquery returns only one result.
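
To compare the options side by side, the sketch below runs all three against the same mock data in one query. The column aliases (`all_values`, `first_hit_value`, `arbitrary_value`) are illustrative names, not part of the ga_sessions schema, and the mock data is a shortened, untyped version of the one above.

```sql
#standardSQL
WITH data AS (
  SELECT [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value), STRUCT(2 AS index, 'val2' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  -- Option 1: keep every match; the column type becomes ARRAY<STRING>.
  ARRAY(
    SELECT custd.value
    FROM UNNEST(hits) AS hits, UNNEST(hits.customDimensions) AS custd
    WHERE custd.index = 1) AS all_values,
  -- Option 2: look only at the first hit of the session.
  (
    SELECT custd.value
    FROM UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS custd
    WHERE custd.index = 1) AS first_hit_value,
  -- Option 3: take any single match across all hits.
  (
    SELECT custd.value
    FROM UNNEST(hits) AS hits, UNNEST(hits.customDimensions) AS custd
    WHERE custd.index = 1
    LIMIT 1) AS arbitrary_value
FROM data
```

Options 2 and 3 keep the column a STRING, so they fit the union from the question without changing the other queries. Option 2 returns NULL when the first hit does not carry the dimension, which may or may not be acceptable depending on how the dimension is set.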