Dataflow zombie jobs - stuck in "Not Started" state

All of our Dataflow jobs have suddenly stopped working. They now just show "Not Started".

When we start a job, it actually seems to spawn many other jobs, and all of them just hang.

Is there a problem with the service?

List of job IDs:

  1. 2015-05-12_04_15_09-9449594780471772631
  2. 2015-05-12_04_11_43-2832089474782567234
  3. 2015-05-12_04_11_10-7703117482304158028
  4. 2015-05-12_04_06_52-8133922783285731870
  5. 2015-05-12_04_06_09-14187812688860505584
  6. 2015-05-12_04_05_32-10296794562342944020
  7. 2015-05-12_04_04_58-17815218306022481742
  8. 2015-05-12_04_04_26-1948202417139012084
  9. 2015-05-12_04_03_55-5718237782405777885
  10. 2015-05-12_04_03_23-8040675812721773662

44227 [main] INFO  com.google.cloud.dataflow.sdk.util.PackageUtil  - Uploading PipelineOptions.filesToStage complete: 1 files newly uploaded, 77 files cached
Dataflow SDK version: 0.4.150414
446168 [main] WARN  com.google.cloud.dataflow.sdk.util.RetryHttpRequestInitializer  - Request failed with code 429, will NOT retry: https://dataflow.googleapis.com/v1b3/projects/gdfp-xxx/jobs
Disconnected from the target VM, address: '127.0.0.1:54217', transport: 'socket'
446171 [main] ERROR com.tls.cdf.dfp.DFPDenormalizationCloudDataFlowJob  - Exception encountered while trying to execute "DFP Denormalization Job"
java.lang.RuntimeException: Failed to create a workflow job: (40153232ba863405): The workflow could not be created. Please try again in a few minutes. If you are still unable to create a job please contact customer support. Causes: (40153232ba8632a6): Your job could not be created. Please try again in a few minutes. If the service still isn't working please contact customer support. Causes: Internal Issue (7a518e51908b45c2): 64605561:22202 Causes: (33edae1682908f81): Too many running jobs. Project gdfp-xxxx is running 10 workflows and project limit for active workflows is 10. To fix this, cancel an existing workflow via the UI, wait for a workflow to finish or contact dataflow-feedback@google.com to request an increase in quota.
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:221)
    at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.run(BlockingDataflowPipelineRunner.java:81)
    at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.run(BlockingDataflowPipelineRunner.java:47)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:145)
    at com.tls.cdf.job.AbstractCloudDataFlowJob.execute(AbstractCloudDataFlowJob.java:100)
    at com.tls.cdf.CloudDataFlowJobExecutor.main(CloudDataFlowJobExecutor.java:44)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 429 Too Many Requests
{
  "code" : 429,
  "errors" : [ {
    "domain" : "global",
    "message" : "(40153232ba863405): The workflow could not be created. Please try again in a few minutes. If you are still unable to create a job please contact customer support. Causes: (40153232ba8632a6): Your job could not be created. Please try again in a few minutes. If the service still isn't working please contact customer support. Causes: Internal Issue (7a518e51908b45c2): 64605561:22202 Causes: (33edae1682908f81): Too many running jobs. Project gdfp-xxxx is running 10 workflows and project limit for active workflows is 10. To fix this, cancel an existing workflow via the UI, wait for a workflow to finish or contact dataflow-feedback@google.com to request an increase in quota.",
    "reason" : "rateLimitExceeded"
  } ],
  "message" : "(40153232ba863405): The workflow could not be created. Please try again in a few minutes. If you are still unable to create a job please contact customer support. Causes: (40153232ba8632a6): Your job could not be created. Please try again in a few minutes. If the service still isn't working please contact customer support. Causes: Internal Issue (7a518e51908b45c2): 64605561:22202 Causes: (33edae1682908f81): Too many running jobs. Project gdfp-xxxx is running 10 workflows and project limit for active workflows is 10. To fix this, cancel an existing workflow via the UI, wait for a workflow to finish or contact dataflow-feedback@google.com to request an increase in quota.",
  "status" : "RESOURCE_EXHAUSTED"
}
    at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:145)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.interceptResponse(AbstractGoogleClientRequest.java:321)
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1056)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:217)
    ... 5 more

It has started working again. It appears to have been an issue with the Dataflow service itself.
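Since the error message above asks the caller to "try again in a few minutes", one way to ride out a transient failure like this is to wrap job submission in a retry with exponential backoff. The following is only a minimal sketch: `RetrySubmit` and `retryWithBackoff` are hypothetical names, not part of the Dataflow SDK, and the lambda in `main` merely simulates a submission that fails twice. Retrying will not help if the project really is at its active-workflow limit; in that case the stuck jobs have to be cancelled first.

```java
import java.util.concurrent.Callable;

public class RetrySubmit {

    /**
     * Hypothetical helper (not part of the Dataflow SDK): runs the action and,
     * if it throws a RuntimeException, waits and tries again, doubling the
     * delay after each failed attempt.
     */
    static <T> T retryWithBackoff(Callable<T> action, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // out of attempts, rethrow below
                }
                Thread.sleep(delay);
                delay *= 2; // exponential backoff
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated submission: fails twice, then succeeds. In real code the
        // lambda would call something like pipeline.run() instead.
        int[] calls = {0};
        String result = retryWithBackoff(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("Failed to create a workflow job");
            }
            return "submitted";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In practice the initial delay would be on the order of minutes rather than milliseconds, and it may be worth retrying only when the exception message indicates a 429 response.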