How to use BigQuery's union all with an inner join?

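I am trying to join all of the comment tables (the comments are sharded into one table per month) to the posts table. Is there a way to perform the union before the inner join? Details on the UNION ALL operator are in the BigQuery documentation. My query against a single comment table is as follows: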

SELECT c.score, c.body, c.link_id, c.parent_id, p.created_utc, c.created_utc
FROM [fh-bigquery:reddit_comments.2016_01] AS c
INNER JOIN [fh-bigquery:reddit_posts.full_corpus_201512] AS p
ON c.parent_id = p.name
WHERE SUBSTR(c.parent_id, 1, 2) = 't3'
ORDER BY c.score DESC
LIMIT 10

Replace

FROM [fh-bigquery:reddit_comments.2016_01] AS c

with

FROM (
  SELECT score, body, link_id, parent_id, created_utc 
  FROM (TABLE_QUERY([fh-bigquery:reddit_comments], 
                    'REGEXP_MATCH(table_id, r"\d{4}_\d{2}")')) 
) AS c  

Hopefully this gives you the idea.
See also the BigQuery documentation on table wildcard functions and regular expression functions.
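
For anyone on standard SQL rather than legacy SQL, a wildcard table with a `_TABLE_SUFFIX` filter plays roughly the same role as TABLE_QUERY. The following is a sketch, not a tested query: it assumes the monthly comment tables share a compatible schema, and the `20*` prefix and the suffix pattern are my own choices for picking up table names like 2016_01.

```sql
#standardSQL
-- Sketch only: joins every monthly comment table to the posts table.
-- The prefix `20*` means _TABLE_SUFFIX holds the rest of the table name,
-- e.g. '16_01' for table 2016_01 (assumed naming scheme).
SELECT
  c.score,
  c.body,
  c.link_id,
  c.parent_id,
  p.created_utc AS post_created_utc,
  c.created_utc AS comment_created_utc
FROM `fh-bigquery.reddit_comments.20*` AS c
INNER JOIN `fh-bigquery.reddit_posts.full_corpus_201512` AS p
  ON c.parent_id = p.name
WHERE REGEXP_CONTAINS(_TABLE_SUFFIX, r'^\d{2}_\d{2}$')  -- monthly shards only
  AND STARTS_WITH(c.parent_id, 't3')                     -- top-level comments
ORDER BY c.score DESC
LIMIT 10
```

To restrict the query to a few months, the regex filter could be swapped for something like `_TABLE_SUFFIX IN ('15_11', '15_12', '16_01')`.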

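As Mikhail Berlyant pointed out in his answer, modifying the query did what I needed. In legacy SQL, a comma-separated list of tables in the FROM clause is treated as a UNION ALL, so the three monthly tables are combined before the join: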

SELECT c.score, c.body, c.link_id, c.parent_id, p.created_utc, c.created_utc, (c.created_utc - p.created_utc) AS time_diff
FROM (
  SELECT * 
  FROM 
    [fh-bigquery:reddit_comments.2015_11],
    [fh-bigquery:reddit_comments.2015_12],
    [fh-bigquery:reddit_comments.2016_01]
) AS c
INNER JOIN [fh-bigquery:reddit_posts.full_corpus_201512] AS p
ON c.parent_id = p.name
WHERE SUBSTR(c.parent_id, 1, 2) = 't3'
ORDER BY c.score DESC
LIMIT 100
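
The same query can also be written with an explicit UNION ALL in standard SQL, where a comma in FROM means a cross join rather than a union. This is a sketch under the assumption that the three monthly tables expose the listed columns with matching types; the `post_created_utc` and `comment_created_utc` aliases are mine and are only there to avoid two output columns named created_utc.

```sql
#standardSQL
-- Sketch only: union the monthly comment tables first, then join to posts.
SELECT
  c.score,
  c.body,
  c.link_id,
  c.parent_id,
  p.created_utc AS post_created_utc,
  c.created_utc AS comment_created_utc,
  -- Assumes both created_utc columns are integer epoch seconds.
  c.created_utc - p.created_utc AS time_diff
FROM (
  SELECT score, body, link_id, parent_id, created_utc
  FROM `fh-bigquery.reddit_comments.2015_11`
  UNION ALL
  SELECT score, body, link_id, parent_id, created_utc
  FROM `fh-bigquery.reddit_comments.2015_12`
  UNION ALL
  SELECT score, body, link_id, parent_id, created_utc
  FROM `fh-bigquery.reddit_comments.2016_01`
) AS c
INNER JOIN `fh-bigquery.reddit_posts.full_corpus_201512` AS p
  ON c.parent_id = p.name
WHERE STARTS_WITH(c.parent_id, 't3')  -- comments whose parent is a post
ORDER BY c.score DESC
LIMIT 100
```

Listing the columns in each branch instead of using SELECT * keeps the union valid even if one month's table has extra columns.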