GCP Dataflow: extract JOB_ID
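For a Dataflow job, I need to extract the JOB_ID given the JOB_NAME. I have the command below and its output. Could you advise how to extract the JOB_ID from the following response?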
$ gcloud dataflow jobs list --region=us-central1 --status=active --filter="name=sample-job"
JOB_ID NAME TYPE CREATION_TIME STATE REGION
2020-10-07_10_11_20-15879763245819496196 sample-job Streaming 2020-10-07 17:11:21 Running us-central1
It would be great if this could be done with a Python script.
You can parse the command's output with standard command-line tools, for example:
gcloud dataflow jobs list --region=us-central1 --status=active --filter="name=sample-job" | tail -n 1 | cut -f 1 -d " "
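Alternatively, if this is already being called from a Python program, you can use the Dataflow API directly instead of going through the gcloud tool.
With Python, you can retrieve the list of jobs with a REST request to the Dataflow method https://dataflow.googleapis.com/v1b3/projects/{projectId}/jobs
The JSON response can then be parsed, using an if clause to filter for the job name you are looking for: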
if job["name"] == 'sample-job'
I tested this approach and it works:
import requests

base_url = 'https://dataflow.googleapis.com/v1b3/projects/'
project_id = '<MY_PROJECT_ID>'
location = '<REGION>'
response = requests.get(
    f'{base_url}{project_id}/locations/{location}/jobs',
    headers={'Authorization': 'Bearer <BEARER_TOKEN_HERE>'},
)
# <BEARER_TOKEN_HERE> can be retrieved with 'gcloud auth print-access-token', run with an account that has access to Dataflow jobs.
# Another authentication mechanism can be found in the link provided by danielm
jobslist = response.json()
# Only the "jobs" key holds job records; other keys such as "nextPageToken" are not lists of jobs.
for job in jobslist.get("jobs", []):
    if job["name"] == 'beamapp-0907191546-413196':
        print(job["name"], " Found, job ID:", job["id"])
    else:
        print(job["name"], " Not matched")
# Output:
# windowedwordcount-0908012420-bd342f98 Not matched
# beamapp-0907200305-106040 Not matched
# beamapp-0907192915-394932 Not matched
# beamapp-0907191546-413196 Found, job ID: 2020-09-07...154989572
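The loop above only inspects a single response. The jobs list method can return results in pages, signalled by a `nextPageToken` field in the response, so a job could be missed in a project with many jobs. Below is a minimal sketch of following those pages. The helper names `fetch_page` and `find_job_id` are hypothetical, and the sketch uses the standard-library `urllib` instead of `requests` only to stay dependency-free; the project, region, and token placeholders are the same as in the answer above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://dataflow.googleapis.com/v1b3/projects/"


def fetch_page(project_id, location, token, page_token=None):
    """Fetch one page of the jobs list (hypothetical helper; makes a real HTTP call)."""
    url = f"{BASE_URL}{project_id}/locations/{location}/jobs"
    if page_token:
        url += "?" + urllib.parse.urlencode({"pageToken": page_token})
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_job_id(job_name, get_page):
    """Return the ID of the first job named job_name, or None if no page has it.

    get_page is any callable that takes a page token (or None for the first
    page) and returns the decoded JSON response for that page.
    """
    page_token = None
    while True:
        page = get_page(page_token)
        for job in page.get("jobs", []):
            if job.get("name") == job_name:
                return job.get("id")
        page_token = page.get("nextPageToken")
        if not page_token:
            return None
```

It could be called as `find_job_id('sample-job', lambda t: fetch_page('<MY_PROJECT_ID>', '<REGION>', '<BEARER_TOKEN_HERE>', t))`. Passing the page fetcher as a callable keeps the search logic separate from the HTTP and authentication details.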
I created a GIST with a Python script to do this.
gcloud dataflow jobs list --region=us-central1 --status=active --filter="name=sample-job" --format="value(JOB_ID)"
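Since the question asks for a Python script, the command above could be wrapped with `subprocess`. This is only a sketch: the function names `build_command`, `parse_job_ids`, and `get_job_ids` are hypothetical, and it assumes an authenticated `gcloud` is available on the PATH.

```python
import subprocess


def build_command(job_name, region):
    """Build the gcloud invocation shown above as an argument list."""
    return [
        "gcloud", "dataflow", "jobs", "list",
        f"--region={region}",
        "--status=active",
        f"--filter=name={job_name}",
        "--format=value(JOB_ID)",
    ]


def parse_job_ids(stdout):
    """Split the command output into job IDs, one per non-empty line."""
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def get_job_ids(job_name, region):
    """Run gcloud and return the matching job IDs (assumes gcloud is installed and authenticated)."""
    result = subprocess.run(
        build_command(job_name, region),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if gcloud exits with an error
    )
    return parse_job_ids(result.stdout)
```

For example, `get_job_ids("sample-job", "us-central1")` would return a list, which is empty when no active job has that name.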