Multiple Left Joins in BigQuery

I'm trying to streamline a working SQL query that I currently use in BigQuery, and I'm running into the following issue:


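Error: ON clause must be AND of = comparisons of one field name from each table, with all field names prefixed with table name. Consider using Standard SQL .google.com/bigquery/docs/reference/standard-sql/), which allows non-equality JOINs and comparisons involving expressions and residual predicates.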


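Below is the query that produces the error above. The first LEFT JOIN works. When I add the second one below it, I start getting the error. What I want is to get the human-readable own.o.firstname and own.o.lastname values instead of the owner_id value on the deal record (o.properties.hubspot_owner_id.value), but to do that I need to join some tables.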

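I had to use CAST in the ON clause of the second JOIN because the fields have different types in each table's schema. If I don't, I get the following error: Error: Join keys o.properties.hubspot_owner_id.value (string) and o.ownerid (int64) have types that cannot be automatically coerced.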

The WHERE clause is just a suppression list, so that entries that have been deleted from the database are not returned.

SELECT o.*
FROM (
  SELECT
    o.dealid,
    o.properties.dealname.value,
    stages.Label,
    o.properties.closedate.value,
    o.properties.hubspot_owner_id.value,
    own.o.firstname,
    own.o.lastname,
    o.properties.amount.value,
    o.properties.createdate.value,
    o.properties.pipeline.value,
    o.associations.associatedcompanyids,
    ROW_NUMBER() OVER (PARTITION BY o.dealid ORDER BY o._sdc_batched_at DESC) as seqnum
  FROM [sample-table:hubspot.deals] o
  LEFT JOIN [sample-table:hubspot.sales_stages_lookup] stages ON o.properties.dealstage.value = stages.Internal_Value
  LEFT JOIN [sample-table:hubspot.owners_reporting] own ON CAST(o.properties.hubspot_owner_id.value AS INTEGER) = CAST(own.o.ownerid AS INTEGER)) o
WHERE o.dealid NOT IN (SELECT objectid FROM [sample-table:hubspot_suppression_list.data] WHERE subscriptiontype = 'deal.deletion') AND seqnum = 1

Use standard SQL in BigQuery instead, which supports expressions as part of the ON clause:

#standardSQL
SELECT o.*
FROM (
  SELECT
    o.dealid,
    o.properties.dealname.value AS dealname_value,
    stages.Label,
    o.properties.closedate.value AS closedate_value,
    o.properties.hubspot_owner_id.value AS hubspot_owner_id_value,
    own.o.firstname,
    own.o.lastname,
    o.properties.amount.value AS amount_value,
    o.properties.createdate.value AS createdate_value,
    o.properties.pipeline.value AS pipeline_value,
    o.associations.associatedcompanyids,
    ROW_NUMBER() OVER (PARTITION BY o.dealid ORDER BY o._sdc_batched_at DESC) as seqnum
  FROM `sample-table.hubspot.deals` o
  LEFT JOIN `sample-table.hubspot.sales_stages_lookup` stages ON o.properties.dealstage.value = stages.Internal_Value
  LEFT JOIN `sample-table.hubspot.owners_reporting` own ON CAST(o.properties.hubspot_owner_id.value AS INT64) = CAST(own.o.ownerid AS INT64)) o
WHERE o.dealid NOT IN (SELECT objectid FROM `sample-table.hubspot_suppression_list.data` WHERE subscriptiontype = 'deal.deletion') AND seqnum = 1
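
To try the same pattern without touching the real tables, here is a minimal self-contained sketch. The inline tables `deals` and `owners`, their column names, and their values are made up for illustration and are not the HubSpot schema. It also swaps CAST for SAFE_CAST, which returns NULL instead of raising an error when a string is not a valid integer; whether that is the right choice depends on what the owner ID column actually contains.

```sql
#standardSQL
-- Hypothetical inline data standing in for the real tables.
WITH deals AS (
  SELECT 1 AS dealid, '101' AS owner_id UNION ALL
  SELECT 2, '102' UNION ALL
  SELECT 3, 'n/a'  -- not a number: SAFE_CAST yields NULL, so no owner matches
),
owners AS (
  SELECT 101 AS ownerid, 'Ada' AS firstname UNION ALL
  SELECT 102, 'Grace'
)
SELECT d.dealid, o.firstname
FROM deals d
LEFT JOIN owners o
  -- An expression in the ON clause: accepted by standard SQL, rejected by legacy SQL.
  ON SAFE_CAST(d.owner_id AS INT64) = o.ownerid
ORDER BY d.dealid
```

Deals 1 and 2 come back with their owner's first name. Deal 3 is still returned, with a NULL firstname, because LEFT JOIN keeps left-hand rows that have no match.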

For more information on the differences between legacy and standard SQL in BigQuery, see the migration guide.