Google API to get data from a BigQuery table
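I am trying to fetch data from a BigQuery table using Python. I know the BigQuery Connector is available and that I can use it to export the table. However, I don't want to involve GCS (Google Cloud Storage), which makes things tricky.
I found a few API calls that would let me get the full table data.
https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/list
Another approach is to query the BigQuery table.
https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
But I don't understand how to call those APIs, from Python or otherwise.
How do I create a client? And how do I authenticate?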
As @GrahamPolley mentioned, you can follow the instructions in the documentation:
Authentication:
To run the client library, you must first set up authentication by
creating a service account and setting an environment variable.
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/auth/[FILE_NAME].json"
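Before creating a client, it can help to confirm that the environment variable really points at a service-account key. The following is a minimal sketch using only the standard library; the helper name `load_key_info` is hypothetical, and it assumes the key file is the usual JSON download containing `type` and `client_email` fields.

```python
import json
import os


def load_key_info(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return (path, client_email) for the key file named by env_var.

    Raises a descriptive error if the variable is unset, the file is
    missing, or the JSON does not look like a service-account key.
    """
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{env_var} points to a missing file: {path}")
    with open(path, encoding="utf-8") as fh:
        info = json.load(fh)
    if info.get("type") != "service_account":
        raise ValueError(f"{path} is not a service-account key file")
    return path, info.get("client_email")
```

Calling `load_key_info()` in the shell session where you ran the `export` should return the key path and the service account's e-mail address; an exception tells you which part of the setup is wrong.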
Create the client (the documentation snippet below is the C# version):
BigQueryClient client = BigQueryClient.Create(projectId);
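Since the question is about Python, here is a rough Python counterpart, assuming the `google-cloud-bigquery` package is installed and `GOOGLE_APPLICATION_CREDENTIALS` is set as above. The helper names `full_table_id` and `fetch_table` are hypothetical; `bigquery.Client()` and `Client.get_table()` are the library calls.

```python
def full_table_id(project, dataset, table):
    """Build the 'project.dataset.table' string accepted by get_table()."""
    return f"{project}.{dataset}.{table}"


def fetch_table(project, dataset, table):
    """Create a client and fetch table metadata (needs credentials and network)."""
    # Imported here so the pure helper above works without the library installed.
    from google.cloud import bigquery

    # With no arguments the client uses Application Default Credentials,
    # i.e. the key file named by GOOGLE_APPLICATION_CREDENTIALS.
    client = bigquery.Client()
    return client.get_table(full_table_id(project, dataset, table))


# Example call (requires credentials, so it is left commented out):
# table = fetch_table("bigquery-public-data", "samples", "shakespeare")
# print(table.num_rows, [field.name for field in table.schema])
```

No GCS bucket is involved here: the client talks to the BigQuery REST API directly.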
To browse the data in the selected table, you can use the example from the official library documentation:
from google.cloud import bigquery
client = bigquery.Client()
# client.dataset() is deprecated in newer library releases; build the reference directly
dataset_ref = bigquery.DatasetReference('bigquery-public-data', 'samples')
table_ref = dataset_ref.table('shakespeare')
table = client.get_table(table_ref) # API call
# Load all rows from a table
rows = client.list_rows(table)
assert len(list(rows)) == table.num_rows
# Load the first 10 rows
rows = client.list_rows(table, max_results=10)
assert len(list(rows)) == 10
# Specify selected fields to limit the results to certain columns
fields = table.schema[:2] # first two columns
rows = client.list_rows(table, selected_fields=fields, max_results=10)
assert len(rows.schema) == 2
assert len(list(rows)) == 10
# Use the start index to load an arbitrary portion of the table
rows = client.list_rows(table, start_index=10, max_results=10)
# Print row data in tabular format
format_string = '{!s:<16} ' * len(rows.schema)
field_names = [field.name for field in rows.schema]
print(format_string.format(*field_names)) # prints column headers
for row in rows:
    print(format_string.format(*row)) # prints row data
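The other route mentioned in the question, querying the table, can also go through the same client rather than raw REST calls. This is a sketch under the same assumptions (library installed, credentials configured); `build_query` and `top_words` are hypothetical helper names, and the SQL is only an illustration against the public `shakespeare` sample table.

```python
def build_query(limit):
    """Return standard SQL selecting the most frequent words in the sample table."""
    limit = int(limit)  # coercing to int keeps arbitrary text out of the SQL string
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT word, SUM(word_count) AS total "
        "FROM `bigquery-public-data.samples.shakespeare` "
        "GROUP BY word ORDER BY total DESC "
        f"LIMIT {limit}"
    )


def top_words(limit=10):
    """Run the query and return (word, total) pairs (needs credentials and network)."""
    # Imported here so build_query() works without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    query_job = client.query(build_query(limit))  # starts the query job
    # result() waits for the job to finish and iterates over the result rows.
    return [(row["word"], row["total"]) for row in query_job.result()]
```

`list_rows` reads table contents as stored, while a query lets you filter and aggregate on the server first, which is usually the better fit when you only need part of a large table.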