How to use BigQuery JSON functions in Google Cloud Datalab
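I'm calling BigQuery from a Google Cloud Datalab notebook, and I want to use the JSON functions that BigQuery provides. It turns out, however, that the JSON functions use "$" to refer to a section within the JSON string, while Cloud Datalab uses "$" to reference global variables, so the two conflict and the query fails with an error.
Example (not reproducible, since I couldn't find any JSON-like strings in the samples):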
%%sql --module events_query
SELECT JSON_EXTRACT_SCALAR(eventTypeParams, '$.restaurant-name') as str
FROM [foodit-prod:analytics.analytics_event]
When I run it:
events_run = bq.Query(events_query)
events = events_run.to_dataframe()
This is the error I get:
Exception Traceback (most recent call last)
in ()
----> 1 events_run = bq.Query(events_query)
2 events = events_run.to_dataframe()
/usr/local/lib/python2.7/dist-packages/gcp/bigquery/_query.pyc in
__init__(self, sql, context, values, udfs, data_sources, **kwargs)
90 values = kwargs
91
---> 92 self._sql = gcp.data.SqlModule.expand(sql, values, udfs)
93
94 # We need to take care not to include the same UDF code twice so we use sets.
/usr/local/lib/python2.7/dist-packages/gcp/data/_sql_module.pyc in
expand(sql, args, udfs)
127 """
128 sql, args = SqlModule.get_sql_statement_with_environment(sql, args)
--> 129 return _sql_statement.SqlStatement.format(sql._sql, args, udfs)
130
131
/usr/local/lib/python2.7/dist-packages/gcp/data/_sql_statement.pyc in
format(sql, args, udfs)
137 code = []
138 SqlStatement._find_recursive_dependencies(sql, args, code=code,
--> 139 resolved_vars=resolved_vars)
140
141 # Rebuild the SQL string, substituting just '$' for escaped $ occurrences,
/usr/local/lib/python2.7/dist-packages/gcp/data/_sql_statement.pyc in
_find_recursive_dependencies(sql, values, code, resolved_vars, resolving_vars)
80
81 # Get the set of $var references in this SQL.
---> 82 dependencies = SqlStatement._get_dependencies(sql)
83 for dependency in dependencies:
84 # Now we check each dependency. If it is in complete - i.e., we have an expansion
/usr/local/lib/python2.7/dist-packages/gcp/data/_sql_statement.pyc in
_get_dependencies(sql)
202 dependencies.append(variable)
203 elif dollar:
--> 204 raise Exception('Invalid sql; $ with no following $ or identifier: %s.' % sql)
205 return dependencies
206
Exception: Invalid sql; $ with no following $ or identifier: SELECT
JSON_EXTRACT_SCALAR(eventTypeParams, "'$'.restaurant-name") as str
FROM [foodit-prod:analytics.analytics_event].
I tried putting the $ sign in different kinds of quotes, escaping it, and so on. None of it worked. Is there a workaround?
Could you try the following?
%%sql
SELECT JSON_EXTRACT_SCALAR(
"{'book': {
'category':'fiction',
'title':'Harry Potter'}}",
"$$.book.category");
Or, using your example:
%%sql --module events_query
SELECT JSON_EXTRACT_SCALAR(eventTypeParams, '$$.restaurant-name') as str
FROM [foodit-prod:analytics.analytics_event]
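To see why doubling the dollar sign helps, here is a minimal Python sketch of the idea. It is a toy model only: the comment in the traceback ("substituting just '$' for escaped $ occurrences") suggests that "$$" is the escape for a literal "$", and the code below imitates that behaviour. The helpers escape_json_path and simulate_expansion are hypothetical names made up for this illustration; they are not part of the gcp library and do not call it.

```python
import re


def escape_json_path(path):
    """Double every '$' so a $-expanding SQL cell would treat it as a literal."""
    return path.replace("$", "$$")


def simulate_expansion(sql, variables):
    """Toy stand-in for $-variable expansion (NOT the real gcp.data code).

    '$$' becomes a literal '$', '$name' is replaced by variables['name'],
    and a lone '$' raises, mirroring the error message in the question.
    """
    def substitute(match):
        token = match.group(0)
        if token == "$$":
            return "$"
        if token == "$":
            raise ValueError("$ with no following $ or identifier")
        return str(variables[token[1:]])

    # Order matters: try the escape first, then an identifier, then a bare '$'.
    return re.sub(r"\$\$|\$[A-Za-z_][A-Za-z0-9_]*|\$", substitute, sql)


# Build the query with an escaped JSONPath and a hypothetical $table variable.
sql = ("SELECT JSON_EXTRACT_SCALAR(eventTypeParams, '%s') AS str FROM $table"
       % escape_json_path("$.restaurant-name"))
expanded = simulate_expansion(
    sql, {"table": "[foodit-prod:analytics.analytics_event]"})
print(expanded)
```

In this sketch the JSONPath reaches the final SQL with a single "$", while an unescaped "$.restaurant-name" would raise, which matches the behaviour described in the question.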