How do I safely grant permission for Cloud Scheduler to create a Dataflow job?

I have a Dataflow template that I can use to run a Dataflow job as a service account of my choosing. I'm actually using one of the Google-provided samples: gs://dataflow-templates/latest/GCS_Text_to_BigQuery.

I now want to schedule it using Cloud Scheduler. I have set up my scheduler job like so:

When the scheduler job runs, it fails with the error PERMISSION_DENIED:

{
  "insertId": "1kw7uaqg3tnzbqu",
  "jsonPayload": {
    "@type": "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished",
    "url": "https://dataflow.googleapis.com/v1b3/projects/project-redacted/locations/europe-west2/templates:launch?gcsPath=gs%3A%2F%2Fdataflow-templates%2Flatest%2FGCS_Text_to_BigQuery",
    "jobName": "projects/project-redacted/locations/europe-west2/jobs/aaa-schedule-dataflow-job",
    "status": "PERMISSION_DENIED",
    "targetType": "HTTP"
  },
  "httpRequest": {
    "status": 403
  },
  "resource": {
    "type": "cloud_scheduler_job",
    "labels": {
      "job_id": "aaa-schedule-dataflow-job",
      "project_id": "project-redacted",
      "location": "europe-west2"
    }
  },
  "timestamp": "2021-12-16T16:41:17.349974291Z",
  "severity": "ERROR",
  "logName": "projects/project-redacted/logs/cloudscheduler.googleapis.com%2Fexecutions",
  "receiveTimestamp": "2021-12-16T16:41:17.349974291Z"
}

I don't know which permission is missing or what I need to grant to make this work, and I'm hoping someone here can help.

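To reproduce the problem, I built a Terraform configuration that creates a Dataflow job from the template, along with all of its prerequisites, and that job executes successfully.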

In the same Terraform configuration I created a Cloud Scheduler job that is meant to launch an identical Dataflow job, but it fails with the error above.
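For readers who don't want to open the repository, a Cloud Scheduler job of this kind looks roughly like the following in Terraform. This is a sketch reconstructed from the URL and job name in the log above, not a copy of the repository's code: the schedule, the Dataflow job name, and the three variables are placeholders I've made up for illustration, and google_service_account.sa is assumed to be the service account the Dataflow job runs as.

```hcl
# Hypothetical variables, declared here only so the sketch is self-contained.
variable "project_id" {
  type = string
}

variable "temp_location" {
  type = string # e.g. a gs:// path the service account can write to
}

variable "template_parameters" {
  type = map(string) # the parameters the GCS_Text_to_BigQuery template expects
}

resource "google_cloud_scheduler_job" "launch_dataflow" {
  name     = "aaa-schedule-dataflow-job"
  region   = "europe-west2"
  schedule = "0 * * * *" # placeholder: hourly

  http_target {
    http_method = "POST"
    uri         = "https://dataflow.googleapis.com/v1b3/projects/${var.project_id}/locations/europe-west2/templates:launch?gcsPath=gs%3A%2F%2Fdataflow-templates%2Flatest%2FGCS_Text_to_BigQuery"

    headers = {
      "Content-Type" = "application/json"
    }

    # Cloud Scheduler calls the Dataflow API with an OAuth token for this
    # service account, so this identity needs permission to create jobs.
    oauth_token {
      service_account_email = google_service_account.sa.email
    }

    # The provider expects the HTTP body to be base64-encoded.
    body = base64encode(jsonencode({
      jobName    = "scheduled-gcs-text-to-bigquery" # placeholder name
      parameters = var.template_parameters
      environment = {
        # The launched job runs as the same service account, which is why
        # the caller must be allowed to act as it.
        serviceAccountEmail = google_service_account.sa.email
        tempLocation        = var.temp_location
      }
    }))
  }
}
```

The point to notice is that the same service account appears twice: once as the caller (oauth_token) and once as the identity the job runs as (environment.serviceAccountEmail).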

All of this code is available at https://github.com/jamiet-msm/dataflow-scheduler-permission-problem/tree/6ef20824af0ec798634c146ee9073b4b40c965e0, and I have written a README there that explains how to run it.

I figured it out. The service account needs to be granted roles/iam.serviceAccountUser on itself:

resource "google_service_account_iam_member" "sa_may_act_as_itself" {
  service_account_id = google_service_account.sa.name
  role               = "roles/iam.serviceAccountUser"
  member             = "serviceAccount:${google_service_account.sa.email}"
}

roles/dataflow.admin is also required; roles/dataflow.worker is not enough. I believe this is because dataflow.jobs.create is needed, which roles/dataflow.worker does not provide (see https://cloud.google.com/dataflow/docs/concepts/access-control#roles for reference).

resource "google_project_iam_member" "df_admin" {
  role   = "roles/dataflow.admin"
  member = "serviceAccount:${google_service_account.sa.email}"
}

Here is the commit with the required changes: https://github.com/jamiet-msm/dataflow-scheduler-permission-problem/commit/3fd7cabdf13d5465e01a928049f54b0bd486ed73
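If roles/dataflow.admin is broader than you would like, note that roles/dataflow.developer also includes dataflow.jobs.create. The grant below is an untested sketch of that narrower alternative, not what the commit above does. Whether it is sufficient for launching this template from Cloud Scheduler is an assumption, and var.project_id is a hypothetical variable (recent versions of the Google provider require project to be set explicitly on this resource).

```hcl
# Hypothetical variable, declared so the sketch stands alone.
variable "project_id" {
  type = string
}

# Untested alternative to roles/dataflow.admin: a narrower role that still
# carries dataflow.jobs.create. Verify it against your own job before relying
# on it; the template launch may need storage permissions that only the
# admin role bundles.
resource "google_project_iam_member" "df_developer" {
  project = var.project_id
  role    = "roles/dataflow.developer"
  member  = "serviceAccount:${google_service_account.sa.email}"
}
```

If the launch still returns PERMISSION_DENIED with this role, fall back to roles/dataflow.admin as shown earlier.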