Error when I try to create different BigQuery tables in the same pipeline execution
I have a pipeline execution with code like this:
PCollection<TableRow> test1 = ...
test1
    .apply(BigQueryIO.Write
        .named("test1 write")
        .to("project_name:dataset_name.test1")
        .withSchema(tableSchema)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

PCollection<TableRow> test2 = ...
test2
    .apply(BigQueryIO.Write
        .named("test2 write")
        .to("project_name:dataset_name.test2")
        .withSchema(tableSchema)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
If I run the pipeline when neither table "test1" nor "test2" exists, I get the following output:
jun 09, 2015 12:29:24 PM com.google.cloud.dataflow.sdk.util.BigQueryTableInserter tryCreateTable
INFORMACIÓN: Trying to create BigQuery table: project_name:dataset_name.test1
jun 09, 2015 12:29:27 PM com.google.cloud.dataflow.sdk.util.RetryHttpRequestInitializer$LoggingHttpBackoffUnsuccessfulResponseHandler handleResponse
ADVERTENCIA: Request failed with code 404, will NOT retry: https://www.googleapis.com/bigquery/v2/projects/pragmatic-armor-455/datasets/audit/tables/project_name:dataset_name.test2/insertAll
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Table project_name:dataset_name.test2",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Table project_name:dataset_name.test2"
}
Why is only the first table created?
Thanks in advance.
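Thanks for reporting this. The cause is a bug in BigQueryIO that occasionally prevents the second table from being created. The bug has now been fixed on GitHub in this commit, and the fix will be pushed to Maven later this month. Sorry for the trouble!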