Querying table views in BigQuery using wildcards with _TABLE_SUFFIX
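I am trying to query a large number (~140) of different table views in Google BigQuery using a wildcard with _TABLE_SUFFIX. However, this results in the following error message: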
"Views cannot be queried through prefix."
Currently I am using this code:
SELECT
tableDate,
`TableA.20*`.ip AS IP,
`TableB.20*`.city AS city,
....
CAST(s.banner AS string) AS sourcecode,
FROM
`TableA.20*`
CROSS JOIN UNNEST(services) AS s
FULL OUTER JOIN `TableB.20*` USING(ip)
WHERE
_TABLE_SUFFIX IN (SELECT table_date FROM `datasetX.dates_table` AS tableDate)
AND
REGEXP_MATCH(cast(s.banner AS string), r'(?i) .....
Structure of "dates_table":
table_date
----------
190305
190312
190319
190326
...
[weekly dates]
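The original dataset looks like this:
As I read in the BigQuery documentation, wildcards can only be used with legacy SQL, and it is not possible to query >views< using wildcards.
My simple question is: what is the alternative for querying data from many different views? Is there another way to iterate over views with a wildcard?
Possible solutions that do not work for me:
The solutions suggested here are unfortunately not possible in my case. I cannot change the dataset, as it comes from an external provider. Trying to expose the _TABLE_SUFFIX column, as suggested, does not work in my case either. Using UNION ALL, as suggested, is not feasible for 140 tables.
I would also be happy with a solution in BigQuery standard SQL, so that I could use, for example, REGEXP_CONTAINS.
Any ideas? That would be great! Thanks a lot.
Frank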
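The only (suitable) solution
Since it is not possible to iterate over table views (for whatever technical reason), the view names have to be hard-coded. I therefore suggest using UNION ALL with hard-coded view names. This means longer code and no automated/iterative process, but at least it works. ;-)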
SELECT
tableDate,
`TableA.20190305`.ip AS IP,
`TableB.20190305`.city AS city,
CAST(s.banner AS string) AS sourcecode,
FROM
`TableA.20190305`
CROSS JOIN UNNEST(services) AS s
FULL OUTER JOIN `TableB.20190305` USING(ip)
UNION ALL
SELECT
tableDate,
`TableA.20190312`.ip AS IP,
`TableB.20190312`.city AS city,
CAST(s.banner AS string) AS sourcecode,
FROM
`TableA.20190312`
CROSS JOIN UNNEST(services) AS s
FULL OUTER JOIN `TableB.20190312` USING(ip)
UNION ALL
<all further tables>
WHERE
REGEXP_MATCH(cast(s.banner AS string), r'(?i) .....
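Writing ~140 UNION ALL branches by hand is tedious, so one option is to generate the hard-coded query text with a small script and paste (or submit) the result. The following is only a sketch under several assumptions: the function name build_union_query is hypothetical, the date suffixes are supplied as a plain list (for example, exported from datasetX.dates_table), the aliases a and b are introduced for readability, and the elided regular expression is not filled in but passed as a named query parameter, here called @banner_pattern. It also assumes standard SQL, where REGEXP_CONTAINS replaces REGEXP_MATCH.

```python
# One SELECT per view; {suffix} is the part after the "20" prefix (e.g. 190305).
# The table/view names follow the post; the aliases a and b are illustrative.
BRANCH_TEMPLATE = (
    "SELECT\n"
    "  '{suffix}' AS tableDate,\n"
    "  ip AS IP,\n"
    "  b.city AS city,\n"
    "  CAST(s.banner AS STRING) AS sourcecode\n"
    "FROM `TableA.20{suffix}` AS a\n"
    "CROSS JOIN UNNEST(a.services) AS s\n"
    "FULL OUTER JOIN `TableB.20{suffix}` AS b USING (ip)\n"
    "WHERE REGEXP_CONTAINS(CAST(s.banner AS STRING), @banner_pattern)"
)


def build_union_query(suffixes):
    """Return one query string that UNION ALLs a hard-coded SELECT per suffix."""
    if not suffixes:
        raise ValueError("at least one date suffix is required")
    branches = []
    for suffix in suffixes:
        # Only plain digits are allowed, so nothing unexpected ends up in the SQL.
        if not (suffix.isascii() and suffix.isdigit()):
            raise ValueError(f"unexpected date suffix: {suffix!r}")
        branches.append(BRANCH_TEMPLATE.format(suffix=suffix))
    return "\nUNION ALL\n".join(branches)


if __name__ == "__main__":
    # The weekly dates shown in dates_table above.
    print(build_union_query(["190305", "190312", "190319", "190326"]))
```

Each branch carries its own WHERE clause, because a single WHERE written after the last UNION ALL would filter only the final SELECT. The generated text still names every view explicitly, so it remains the hard-coded approach described above; only the typing is automated.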