BigQuery Google Analytics: number of sessions that visited a set of pages or had specific custom dimension values

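Using Google Analytics data in Google BigQuery, I am trying to recreate a GA segment and pull basic metrics such as sessions. The segment is defined as: Custom Dimension A = 1 or 5, OR Custom Dimension B has a value, OR Page = pageA, OR Page = pageB.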

The code below works for the two custom dimension conditions. Once I add the page part, the result is always all sessions for the day.

SELECT 

  COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitStartTime AS STRING))) AS Sessions
    FROM
    `123456789.ga_sessions_20200201` as Data,
    unnest(hits) as hits

  WHERE totals.visits = 1
-- custom dimension A
  and (
       (SELECT
          value
        FROM
          UNNEST(hits.customDimensions)
        WHERE
          index = 9
        GROUP BY
          1) is not null
-- custom dimension B
    or (SELECT
          value
        FROM
          UNNEST(hits.customDimensions)
        WHERE
          index = 10
        GROUP BY
          1) in ('1','5') 
-- Page
  or Exists(Select
            hits.page.pagePath AS Page
         FROM
           `123456789.ga_sessions_20200201` ,
            unnest(hits) as hits
         Where totals.visits = 1
              AND hits.type = 'PAGE'
              and hits.page.pagePath in ('pageA','pageB'))
       )

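Your mistake is that you query the ga_sessions table again inside the third condition. That subquery is not tied to the outer row: it scans the whole table, so EXISTS returns true as long as any session that day viewed one of those pages. As a result, the condition holds for every row.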
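To illustrate the difference, here is a minimal sketch in BigQuery Standard SQL. The `sessions` CTE, its `session_id` and `pages` columns, and the values are made up for the example and are not part of the GA export schema. Only session `s1` viewed `pageA`, yet the uncorrelated EXISTS is true for all three rows, while the correlated one is true for `s1` only.

```sql
-- Toy data (hypothetical): three sessions, only 's1' viewed pageA.
WITH sessions AS (
  SELECT 's1' AS session_id, ['pageA', 'home'] AS pages UNION ALL
  SELECT 's2', ['home'] UNION ALL
  SELECT 's3', ['contact']
)
SELECT
  s.session_id,
  -- Uncorrelated: re-reads the whole table, ignores the current row.
  EXISTS (
    SELECT 1
    FROM sessions AS other, UNNEST(other.pages) AS p
    WHERE p IN ('pageA', 'pageB')
  ) AS uncorrelated_match,
  -- Correlated: only inspects the current row's own array.
  EXISTS (
    SELECT 1
    FROM UNNEST(s.pages) AS p
    WHERE p IN ('pageA', 'pageB')
  ) AS correlated_match
FROM sessions AS s
ORDER BY s.session_id
```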

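Also, you don't need to join on unnest(hits). The join produces one row per hit, so each session appears multiple times. If you work on the table without joining the nested hits, there is exactly one row per session, which makes counting much easier.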
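A small sketch of the row multiplication, again on made-up data (the `sessions` CTE and its columns are hypothetical stand-ins for the real export table). Two sessions with three and one hits become four rows once the array is joined in, which is why the original query needed COUNT(DISTINCT ...) to get back to sessions.

```sql
-- Toy data (hypothetical): two sessions with 3 hits and 1 hit.
WITH sessions AS (
  SELECT 's1' AS session_id, ['pageA', 'home', 'pageB'] AS hits UNION ALL
  SELECT 's2', ['home']
)
SELECT
  -- One row per session.
  (SELECT COUNT(*) FROM sessions) AS rows_without_unnest,
  -- One row per hit after the join with UNNEST.
  (SELECT COUNT(*) FROM sessions AS s, UNNEST(s.hits) AS hit) AS rows_with_unnest
```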

I have also updated and simplified the query, so I think this will give you the result you want.

SELECT 
    COUNT(*) AS Sessions
FROM
    `123456789.ga_sessions_20200201` as Data

WHERE totals.visits = 1
    and exists (
        SELECT
          1
        FROM
          UNNEST(hits) as hit
        WHERE
          EXISTS (select 1 from unnest(hit.customDimensions) where index = 9 and value is not null) -- custom dimension A
          OR EXISTS (select 1 from unnest(hit.customDimensions) where index = 10 and value in ('1', '5')) -- custom dimension B
          OR (hit.type = 'PAGE' and hit.page.pagePath in ('pageA', 'pageB')) -- Page
    )
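If you want to sanity-check the logic without scanning real data, one option is to run the same WHERE clause against a tiny inline table. The sketch below is only an assumption about how such a check could look: the `ga_sessions` CTE mimics just the fields the query touches, and all values are invented. Under these assumptions, sessions `v1` (page match) and `v2` (custom dimension 10 = '5') qualify and `v3` does not.

```sql
-- Inline stand-in for the export table (hypothetical values, reduced schema).
WITH ga_sessions AS (
  -- v1: matches on page path only.
  SELECT 'v1' AS fullVisitorId, STRUCT(1 AS visits) AS totals,
    [STRUCT('PAGE' AS type, STRUCT('pageA' AS pagePath) AS page,
            ARRAY<STRUCT<index INT64, value STRING>>[] AS customDimensions)] AS hits
  UNION ALL
  -- v2: matches on custom dimension 10.
  SELECT 'v2' AS fullVisitorId, STRUCT(1 AS visits) AS totals,
    [STRUCT('EVENT' AS type, STRUCT('home' AS pagePath) AS page,
            [STRUCT(10 AS index, '5' AS value)] AS customDimensions)] AS hits
  UNION ALL
  -- v3: matches none of the three conditions.
  SELECT 'v3' AS fullVisitorId, STRUCT(1 AS visits) AS totals,
    [STRUCT('PAGE' AS type, STRUCT('home' AS pagePath) AS page,
            [STRUCT(10 AS index, '2' AS value)] AS customDimensions)] AS hits
)
SELECT
  COUNT(*) AS Sessions
FROM ga_sessions
WHERE totals.visits = 1
  AND EXISTS (
    SELECT 1
    FROM UNNEST(hits) AS hit
    WHERE
      EXISTS (SELECT 1 FROM UNNEST(hit.customDimensions) WHERE index = 9 AND value IS NOT NULL)
      OR EXISTS (SELECT 1 FROM UNNEST(hit.customDimensions) WHERE index = 10 AND value IN ('1', '5'))
      OR (hit.type = 'PAGE' AND hit.page.pagePath IN ('pageA', 'pageB'))
  )
```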