Write data to a BigQuery table using the load_table_from_dataframe method: ERROR - 'str' object has no attribute 'to_api_repr'
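I am trying to read data from Cloud Storage and write it into a BigQuery table. I use the pandas library to read the data from GCS and the client.load_table_from_dataframe method to write it. I am running this code as a Python operator in Google Cloud Composer. When the code executes, I get the error below.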
[2020-06-23 17:09:36,119] {taskinstance.py:1059} ERROR - 'str' object has no attribute 'to_api_repr'@-@{"workflow": "DataTransformationSample1", "task-id": "dag_init", "execution-date": "2020-06-23T17:03:42.202219+00:00"}
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models/taskinstance.py", line 930, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/airflow/airflow/operators/python_operator.py", line 113, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/airflow/airflow/operators/python_operator.py", line 118, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/home/airflow/gcs/dags/DataTransformationSample1.py", line 225, in dag_initialization
    destination=table_id, job_config=job_config)
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/bigquery/client.py", line 968, in load_table_from_dataframe
    job_config=job_config,
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/bigquery/client.py", line 887, in load_table_from_file
    job_resource = load_job._build_resource()
  File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/bigquery/job.py", line 1379, in _build_resource
    self.destination.to_api_repr())
AttributeError: 'str' object has no attribute 'to_api_repr'
[2020-06-23 17:09:36,122] {base_task_runner.py:115} INFO - Job 202544: Subtask dag_init [2020-06-23 17:09:36,119] {taskinstance.py:1059} ERROR - 'str' object has no attribute 'to_api_repr'@-@{"workflow": "DataTransformationSample1", "task-id": "dag_init", "execution-date": "2020-06-23T17:03:42.202219+00:00"}
Below is the code I am using:
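from google.cloud import bigquery

client = bigquery.Client()
table_id = 'project.dataset.table'

# Declare the destination schema explicitly: two nullable STRING columns.
job_config = bigquery.LoadJobConfig()
job_config.schema = [
    bigquery.SchemaField(name="Code", field_type="STRING", mode="NULLABLE"),
    bigquery.SchemaField(name="Value", field_type="STRING", mode="NULLABLE"),
]
job_config.create_disposition = "CREATE_IF_NEEDED"  # create the table if it is missing
job_config.write_disposition = "WRITE_TRUNCATE"     # overwrite any existing rows

# concatenated_df is the pandas DataFrame read from GCS earlier in the DAG.
load_result = client.load_table_from_dataframe(
    dataframe=concatenated_df, destination=table_id, job_config=job_config
)
load_result.result()  # block until the load job finishes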
Can anyone help me resolve this issue?
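Basically, pandas treats strings as objects, but BigQuery does not know that. We need to explicitly convert the object columns to strings with pandas so that the data can be loaded into the BQ table.
df[column_name] = df[column_name].astype(str)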
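As a rough illustration of the conversion described above, here is a minimal sketch that casts the two schema columns to strings before the load. The helper name stringify_columns and the sample rows are made up for this example; only the column names "Code" and "Value" and the DataFrame name concatenated_df come from the question.

```python
import pandas as pd


def stringify_columns(df, columns):
    """Return a copy of df with each listed column cast to str.

    Hypothetical helper for illustration; not part of pandas or BigQuery.
    """
    out = df.copy()  # leave the caller's DataFrame untouched
    for col in columns:
        out[col] = out[col].astype(str)
    return out


# Made-up sample rows with mixed Python types, standing in for the data read from GCS.
concatenated_df = pd.DataFrame({"Code": ["A1", 42], "Value": [1.5, "x"]})

clean_df = stringify_columns(concatenated_df, ["Code", "Value"])
print(clean_df["Code"].tolist(), clean_df["Value"].tolist())
```

clean_df would then be passed to client.load_table_from_dataframe in place of concatenated_df. One thing to check in real data: depending on the pandas version, astype(str) may turn missing values into literal strings, so it can be worth handling nulls before the cast.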