Dataflow errors - "IOException: Failed to write to GCS path.." "Backend Error 500"

One of our pipelines is throwing the error below, which we've never seen before. We're processing roughly 625 million rows from a BigQuery table. The job still completed and is recorded as "successful" in the console, but we're worried that the file Dataflow failed to write to GCS (Dataflow writes to GCS and then loads into BigQuery) was never loaded into BigQuery, and that we've now lost some data.

It's hard for us to confirm whether those rows were loaded, because of the sheer volume of data involved.

Is there any way to tell whether Dataflow loaded that file?
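(For context, the sanity check we had in mind: count the records across the staged shard files and compare against the expected total. A minimal local sketch of that idea, assuming newline-delimited JSON shards; the file names and counts here are made up for illustration:)

```python
import glob
import os
import tempfile

def count_rows(shard_glob):
    """Sum the newline-delimited records across all matching shard files."""
    total = 0
    for path in glob.glob(shard_glob):
        with open(path) as f:
            total += sum(1 for _ in f)
    return total

# Tiny local demo: three "shards" holding 2 + 3 + 1 = 6 records.
tmp = tempfile.mkdtemp()
for i, rows in enumerate([2, 3, 1]):
    with open(os.path.join(tmp, "shard-%05d.json" % i), "w") as f:
        f.write("{}\n" * rows)  # one empty JSON object per record

total = count_rows(os.path.join(tmp, "shard-*.json"))
```

(Against the actual GCS staging directory one could do the equivalent with `gsutil cat gs://<bucket>/<path>/shard-* | wc -l`, though at 625M rows that is slow and costs egress.)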

Job ID: 2015-05-27_18_21_21-8377993823053896089

2015-05-28T01:21:23.210Z: (c1e36887ebb5e3b3): Autoscaling: Enabled for job /workflows/wf-2015-05-27_18_21_21-8377993823053896089
2015-05-28T01:22:23.711Z: (45988c062ea96b38): Autoscaling: Resizing worker pool from 1 to 3.
2015-05-28T01:23:53.713Z: (45988c062ea96352): Autoscaling: Resizing worker pool from 3 to 12.
2015-05-28T01:25:23.715Z: (45988c062ea96b6c): Autoscaling: Resizing worker pool from 12 to 48.
2015-05-28T01:26:53.716Z: (45988c062ea96386): Autoscaling: Resizing worker pool from 48 to 64.
2015-05-28T01:48:48.863Z: (54b9f9ed2402c4e7): java.io.IOException: Failed to write to GCS path gs://<removed>/15697574167464387868/dax-tmp-2015-05-27_18_21_21-8377993823053896089-S09-1-731cba632206348a/-shard-00000-of-00001_C183_00000-of-00001-try-52ba464032d439ee-endshard.json.
    at com.google.cloud.dataflow.sdk.util.gcsio.GoogleCloudStorageWriteChannel.throwIfUploadFailed(GoogleCloudStorageWriteChannel.java:372)
    at com.google.cloud.dataflow.sdk.util.gcsio.GoogleCloudStorageWriteChannel.close(GoogleCloudStorageWriteChannel.java:270)
    at com.google.cloud.dataflow.sdk.runners.worker.TextSink$TextFileWriter.close(TextSink.java:243)
    at com.google.cloud.dataflow.sdk.util.common.worker.WriteOperation.finish(WriteOperation.java:100)
    at com.google.cloud.dataflow.sdk.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:74)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.doWork(DataflowWorker.java:130)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:95)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:139)
    at com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerHarness$WorkerThread.call(DataflowWorkerHarness.java:124)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone
{
  "code" : 500,
  "errors" : [ {
    "domain" : "global",
    "message" : "Backend Error",
    "reason" : "backendError"
  } ],
  "message" : "Backend Error"
}
    at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.dataflow.sdk.util.gcsio.GoogleCloudStorageWriteChannel$UploadOperation.run(GoogleCloudStorageWriteChannel.java:166)
    ... 3 more

2015-05-28T01:48:53.870Z: (4aaf52256f502f1a): Failed task is going to be retried.
2015-05-28T02:00:49.444Z: S09: (aafd22d37feb496e): Unable to delete temporary files gs://<removed>/15697574167464387868/dax-tmp-2015-05-27_18_21_21-8377993823053896089-S09-1-731cba632206348a/@DAX.json$ Causes: (aafd22d37feb4227): Unable to delete directory: gs://<removed>/15697574167464387868/dax-tmp-2015-05-27_18_21_21-8377993823053896089-S09-1-731cba632206348a.

Dataflow retries failed tasks (up to 4 times). In this case the error appears to have been transient, and the task succeeded on retry. Your data should be complete.
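The retry behavior works roughly like the sketch below. This is an illustrative model, not the actual Dataflow worker code: `flaky_write`, the exception class, and the backoff omission are all assumptions for the demo; only the 4-attempt limit comes from the answer above.

```python
MAX_ATTEMPTS = 4  # Dataflow retries a failed task up to 4 times

class TransientBackendError(Exception):
    """Stand-in for the 410/500 backend errors seen in the job log."""

def run_with_retries(task, max_attempts=MAX_ATTEMPTS):
    """Re-run `task` until it succeeds or the attempt budget is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TransientBackendError:
            if attempt == max_attempts:
                raise  # budget exhausted: the job would be marked failed
            # (real workers back off between attempts; omitted here)

# Demo: a write that fails twice with a transient error, then succeeds,
# mirroring the "Failed task is going to be retried" log line above.
calls = {"n": 0}
def flaky_write():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientBackendError("Backend Error 500")
    return "shard written"

result = run_with_retries(flaky_write)
```

Because the final attempt succeeded before the budget ran out, the step (and the job) is reported as successful and the output shard is intact; only if all attempts failed would the job itself fail.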