ContextualVersionConflict using BigQuery in AI-Platform-Notebooks
I am trying to use BigQuery in AI-Platform-Notebooks, but I am running into a ContextualVersionConflict. In this toy example, I am trying to pull two columns of data from a BigQuery table called bgt_all, in the project job2vec:
from google.cloud import bigquery
client = bigquery.Client()
aaa="""
SELECT BGTJobId, soc6 FROM `job2vec.bq_bgt_storage.bgt_all` LIMIT 100
"""
df = client.query(aaa).to_dataframe()
df.head()
which returns
---------------------------------------------------------------------------
ContextualVersionConflict Traceback (most recent call last)
<ipython-input-25-7bdfe216bcc8> in <module>
7 SELECT BGTJobId, soc6 FROM `job2vec.bq_bgt_storage.bgt_all` LIMIT 100
8 """
----> 9 df = client.query(aaa).to_dataframe()
10 df.head()
/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/job.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, date_as_object)
3381 progress_bar_type=progress_bar_type,
3382 create_bqstorage_client=create_bqstorage_client,
-> 3383 date_as_object=date_as_object,
3384 )
3385
/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_dataframe(self, bqstorage_client, dtypes, progress_bar_type, create_bqstorage_client, date_as_object)
1725 progress_bar_type=progress_bar_type,
1726 bqstorage_client=bqstorage_client,
-> 1727 create_bqstorage_client=create_bqstorage_client,
1728 )
1729 df = record_batch.to_pandas(date_as_object=date_as_object)
/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/table.py in to_arrow(self, progress_bar_type, bqstorage_client, create_bqstorage_client)
1535 owns_bqstorage_client = False
1536 if not bqstorage_client and create_bqstorage_client:
-> 1537 bqstorage_client = self.client._create_bqstorage_client()
1538 owns_bqstorage_client = bqstorage_client is not None
1539
/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery/client.py in _create_bqstorage_client(self)
402 """
403 try:
--> 404 from google.cloud import bigquery_storage_v1
405 except ImportError:
406 warnings.warn(
/opt/conda/lib/python3.7/site-packages/google/cloud/bigquery_storage_v1/__init__.py in <module>
20
21 __version__ = pkg_resources.get_distribution(
---> 22 "google-cloud-bigquery-storage"
23 ).version # noqa
24
/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in get_distribution(dist)
478 dist = Requirement.parse(dist)
479 if isinstance(dist, Requirement):
--> 480 dist = get_provider(dist)
481 if not isinstance(dist, Distribution):
482 raise TypeError("Expected string, Requirement, or Distribution", dist)
/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in get_provider(moduleOrReq)
354 """Return an IResourceProvider for the named module or requirement"""
355 if isinstance(moduleOrReq, Requirement):
--> 356 return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
357 try:
358 module = sys.modules[moduleOrReq]
/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in require(self, *requirements)
897 included, even if they were already activated in this working set.
898 """
--> 899 needed = self.resolve(parse_requirements(requirements))
900
901 for dist in needed:
/opt/conda/lib/python3.7/site-packages/pkg_resources/__init__.py in resolve(self, requirements, env, installer, replace_conflicting, extras)
788 # Oops, the "best" so far conflicts with a dependency
789 dependent_req = required_by[req]
--> 790 raise VersionConflict(dist, req).with_context(dependent_req)
791
792 # push the new requirements onto the stack
ContextualVersionConflict: (google-api-core 1.22.1 (/opt/conda/lib/python3.7/site-packages), Requirement.parse('google-api-core[grpc]<2.0.0dev,>=1.22.2'), {'google-cloud-bigquery-storage'})
This is strange, because when I run
!pip install google-api-core --upgrade
it reports version 1.24.1, so I don't quite understand why this happens.
Edit: when I type !conda list | grep google, the following appears:
google-api-core-grpcio-gcp 1.16.0 1 conda-forge
google-api-python-client 1.9.1 pyh9f0ad1d_0 conda-forge
google-apitools 0.5.31 pypi_0 pypi
google-auth 1.24.0 pypi_0 pypi
google-auth-httplib2 0.0.3 py_3 conda-forge
google-auth-oauthlib 0.4.1 py_2 conda-forge
google-cloud-bigquery 1.24.0 pypi_0 pypi
google-cloud-bigquery-storage 2.1.0 pypi_0 pypi
google-cloud-bigtable 1.0.0 pypi_0 pypi
google-cloud-core 1.3.0 pypi_0 pypi
google-cloud-dataproc 1.1.1 pypi_0 pypi
google-cloud-datastore 1.7.4 pypi_0 pypi
google-cloud-dlp 0.13.0 pypi_0 pypi
google-cloud-firestore 1.8.1 pypi_0 pypi
google-cloud-kms 1.4.0 pypi_0 pypi
google-cloud-language 1.3.0 pypi_0 pypi
google-cloud-logging 1.15.1 pypi_0 pypi
google-cloud-pubsub 1.0.2 pypi_0 pypi
google-cloud-scheduler 1.3.0 pypi_0 pypi
google-cloud-spanner 1.17.1 pypi_0 pypi
google-cloud-speech 1.3.2 pypi_0 pypi
google-cloud-storage 1.30.0 pypi_0 pypi
google-cloud-tasks 1.5.0 pypi_0 pypi
google-cloud-translate 2.0.2 pypi_0 pypi
google-cloud-videointelligence 1.13.0 pypi_0 pypi
google-cloud-vision 0.42.0 pypi_0 pypi
google-crc32c 0.1.0 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
google-resumable-media 0.7.1 pypi_0 pypi
googleapis-common-protos 1.51.0 py37hc8dfbb8_2 conda-forge
grpc-google-iam-v1 0.12.3 pypi_0 pypi
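One way to see why pip and the running kernel can disagree is to check which copy of a package Python actually imports: in a mixed conda/pip environment, `pip install --upgrade` can write a new copy into one site-packages directory while an older copy earlier on `sys.path` keeps winning. A minimal diagnostic sketch (demonstrated here with a stdlib module; in the notebook you would pass `"google.api_core"` instead):

```python
import importlib

def where_imported_from(module_name):
    # Import the module and report the file Python actually loaded,
    # plus its version if it exposes one. Comparing this path against
    # the location pip reports reveals shadowed installs.
    mod = importlib.import_module(module_name)
    path = getattr(mod, "__file__", "<built-in>")
    version = getattr(mod, "__version__", "<no __version__ attribute>")
    return path, version

# Demonstrated with the stdlib json module; in the notebook, call
# where_imported_from("google.api_core") instead.
path, version = where_imported_from("json")
print(path, version)
```

If the path printed for `google.api_core` points at a different install than the one pip just upgraded, the kernel is importing the stale copy, which matches the 1.22.1 version shown in the traceback.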
To further contribute to the community, I am posting the answer based on the comments above.
First, you should try upgrading the packages with the following command:
pip install --upgrade pandas-gbq 'google-cloud-bigquery[bqstorage,pandas]'
Then, instead of the to_dataframe() method, you can use the read_gbq() method, which loads data from BigQuery using the environment's default project, as follows:
import pandas
sql = """
SELECT name
FROM `bigquery-public-data.usa_names.usa_1910_current`
WHERE state = 'TX'
LIMIT 100
"""
# Run a Standard SQL query using the environment's default project
df = pandas.read_gbq(sql, dialect='standard')
# Run a Standard SQL query with the project set explicitly
project_id = 'your-project-id'
df = pandas.read_gbq(sql, project_id=project_id, dialect='standard')
The code above was taken from the documentation, here.
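If upgrading is not immediately possible, another possible workaround (an untested sketch) is to tell `to_dataframe()` not to create a BigQuery Storage client at all. `create_bqstorage_client` is a real parameter of `to_dataframe()` (it appears in the traceback's signatures above); passing `False` skips the import of google-cloud-bigquery-storage that raises the ContextualVersionConflict, at the cost of downloading results over the slower REST API:

```python
def to_dataframe_kwargs(use_storage_api=False):
    # Build keyword arguments for QueryJob.to_dataframe(). Setting
    # create_bqstorage_client=False prevents the client library from
    # importing google-cloud-bigquery-storage, the package whose
    # requirement on google-api-core triggers the conflict.
    return {"create_bqstorage_client": use_storage_api}

# In the notebook (requires google-cloud-bigquery and credentials):
# df = client.query(aaa).to_dataframe(**to_dataframe_kwargs())
print(to_dataframe_kwargs())
```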
In Deep Learning VM image version 50, the library google-api-core-grpcio-gcp was pinned to version 1.16 because of a library issue. This library later turned out to conflict with google-cloud-bigquery-storage, which requires a newer version (1.22 or greater). If you start using a Deep Learning VM image of version 59+, in which the pin was removed, you should not see this issue:
google-api-core 1.22.4 pyh9f0ad1d_0 conda-forge
google-api-core-grpcio-gcp 1.22.2 hc8dfbb8_0 conda-forge
You can create a brand new notebook, or, if you are using the Notebooks API, we also provide an upgrade endpoint that you can use to upgrade to the latest DLVM version.