While exporting SELECT statement result to BigQuery only empty table is created
I am trying to export the result of a SELECT statement into another table as permanent storage. However, the newly created table has no schema, and when I try to query it I get this error:
Table project-id.dataset_name.temp_table does not have a schema.
Here is my code for exporting the result of the SELECT statement to a temporary table:
import time
import logging

from google.cloud import bigquery
from google.cloud.bigquery import Table
from google.oauth2.service_account import Credentials


def query_to_table():
    service_account_info = {}  # account info
    credentials = Credentials.from_service_account_info(
        service_account_info)
    client = bigquery.Client(
        project=service_account_info.get("project_id"),
        credentials=credentials)

    query = """
    SELECT
        a,
        b
    FROM `project.dataset.table`
    WHERE a NOT IN ('error', 'warning')
    """

    destination_dataset = client.dataset("abc_123")  # this is another dataset
    destination_table = destination_dataset.table("temp_table")  # destination table
    try:
        client.get_table(destination_table)
        client.delete_table(destination_table)
    except Exception as e:
        # Some logging
        pass
    client.create_table(Table(destination_table))

    # Execute the job and save to table
    job_config = bigquery.QueryJobConfig()
    job_config.allow_large_results = True
    job_config.use_legacy_sql = False
    job_config.destination = destination_table
    job_config.dry_run = True
    query_job = client.query(query, job_config=job_config)

    # Wait till the job is done
    while not query_job.done():
        time.sleep(1)

    logging.info(f"Processed {query_job.total_bytes_processed} bytes.")
    return destination_table
What is wrong here? Has anything changed in the Google Cloud API?
The script was still working a month ago.
Please help.
Damn! I just figured it out: it was because I had set dry_run to True.
According to this: if dry_run is set to True, the query is only evaluated and the job is not actually run, so nothing is ever written to the destination table.
That cost me 5 hours. :(
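For reference, here is a minimal sketch of the job submission without dry_run, assuming the same client, query, and destination_table setup as above. The write_disposition line is an optional simplification I added (not part of the original script) so the manual delete/create of the destination table is no longer needed:

def run_query_to_table(client, query, destination_table):
    # Same config as before, but dry_run is left at its default (False),
    # so the job actually runs and writes into destination_table.
    job_config = bigquery.QueryJobConfig()
    job_config.use_legacy_sql = False
    job_config.destination = destination_table
    # Optional (assumption): let BigQuery overwrite any existing table
    # instead of deleting and recreating it by hand.
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

    query_job = client.query(query, job_config=job_config)
    query_job.result()  # blocks until the job finishes, raises on error

    logging.info(f"Processed {query_job.total_bytes_processed} bytes.")
    return destination_table

With that change the destination table ends up with the query's schema and rows instead of being empty.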