ArrowIOError when reading from BigQuery with the google-cloud-bigquery client library (Python)
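I have a function that retrieves a pandas DataFrame from a BigQuery query. It has worked fine for the past few months.
Today, without any changes on my side, it stopped working in Google Colab notebooks and throws this exception: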
An exception of type ArrowIOError occurred reading from BigQuery. Arguments:
('Cannot read a negative number of bytes from BufferReader.',)
My code:
def read_from_bigquery_client(bq_client, project_id, sql, curr_func):
    try:
        # Run the query and load the full result set into a pandas DataFrame.
        df = bq_client.query(sql, project=project_id).to_dataframe()
        return df
    except Exception as ex:
        # Report the exception type, its arguments, and the calling function.
        template = "An exception of type {0} occurred reading from BigQuery. Arguments:\n{1!r}\nFunction: {2}"
        message = template.format(type(ex).__name__, ex.args, curr_func)
        print(message)
        return None
Client authentication:
from google.cloud import bigquery
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(local_cred_filename)
bq_client = bigquery.Client(credentials=credentials,
                            project=credentials.project_id)
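The queries I tried work fine when run directly in BigQuery, and, as mentioned above, they also worked through this function until now.
Thanks for your help.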
A new version (1.26.0) of the google-cloud-bigquery Python library was released on 22 July. It may contain issues that have not been detected yet. A similar issue with this version has already been reported on GitHub, where you can follow updates. Please also report the error you encountered there.
For now, the workaround for the ArrowIOError is to downgrade the google-cloud-bigquery library.
I downgraded google-cloud-bigquery to 1.24.0 and I still get the error.
My other versions are:
pyarrow==0.11.1
pandas==0.23.4
pandas-gbq==0.7.0
google-cloud-bigquery==1.24.0
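Since the answers in this thread depend on which versions are actually installed in the running environment, it can help to print them before and after any change. The following is only a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_versions` and the `PACKAGES` list are made up for illustration and are not part of any library.

```python
from importlib import metadata  # standard library, Python 3.8+

# Distributions mentioned in this thread (illustrative list, adjust as needed).
PACKAGES = ["google-cloud-bigquery", "pyarrow", "pandas", "pandas-gbq"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}" if version else f"{name} is not installed")
```

Running this in the same notebook that raises the error shows the versions the runtime really resolves, which may differ from what was last requested with pip.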
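I kept running into the same problem until I upgraded my pandas package. Apparently, from what I saw in the documentation, pandas versions higher than 0.29.0 work with google-cloud-bigquery.
The best way to update pandas is:
pip3 install --upgrade pandas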
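Both suggestions in this thread (pinning google-cloud-bigquery to an older release, or upgrading pandas) come down to a pip invocation. As a rough sketch, and not something taken from the answers above, the commands can be built from Python so that they target the interpreter the notebook is actually running. The helper names `pip_install_command` and `run_pip` are hypothetical.

```python
import subprocess
import sys


def pip_install_command(package, version=None, upgrade=False):
    """Build a pip install command for the current interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("--upgrade")
    # Pin an exact version when one is given, otherwise install the package by name.
    cmd.append(f"{package}=={version}" if version else package)
    return cmd


def run_pip(cmd):
    """Run the command and raise CalledProcessError if pip fails."""
    subprocess.check_call(cmd)


# Example commands (built here, not executed):
downgrade_bigquery = pip_install_command("google-cloud-bigquery", version="1.24.0")
upgrade_pandas = pip_install_command("pandas", upgrade=True)
```

Calling `run_pip(upgrade_pandas)` would perform the upgrade. Modules that were already imported stay loaded in the running process, so in a notebook the runtime typically needs a restart before the new version takes effect.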