Access the CDAP REST API of a Cloud Data Fusion instance
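How do I access the CDAP REST API of a Cloud Data Fusion instance? I would like to use Cloud Composer to orchestrate my pipelines.
I have an Enterprise edition instance with private IP enabled, but I cannot find any documentation on how to access the REST API.
The instance details page only shows a /22 IP address range; it does not specify a particular IP. Do I use the IAP-protected URL that I use to access the UI?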
You can get the CDAP API endpoint of your Data Fusion instance with the projects.locations.instances.list method. You can test it with the API Explorer or with curl:
PROJECT=$(gcloud config get-value project)
TOKEN=$(gcloud auth print-access-token)
LOCATION=europe-west4
curl -H "Authorization: Bearer $TOKEN" \
https://datafusion.googleapis.com/v1beta1/projects/$PROJECT/locations/$LOCATION/instances
{
"instances": [
{
"name": "projects/PROJECT/locations/europe-west4/instances/data-fusion-1",
"type": "BASIC",
"networkConfig": {},
"createTime": "2019-11-10T12:02:55.776479620Z",
"updateTime": "2019-11-10T12:16:41.560477044Z",
"state": "RUNNING",
"serviceEndpoint": "https://data-fusion-1-PROJECT-dot-euw4.datafusion.googleusercontent.com",
"version": "6.1.0.2",
"serviceAccount": "cloud-datafusion-management-sa@REDACTED-tp.iam.gserviceaccount.com",
"displayName": "data-fusion-1",
"apiEndpoint": "https://data-fusion-1-PROJECT-dot-euw4.datafusion.googleusercontent.com/api"
}
]
}
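If you would rather read the endpoint programmatically than copy it from the curl output, the response above can be parsed with a few lines of Python. This is only a minimal sketch: the function name find_api_endpoint is hypothetical, and it assumes the response has the same shape as the sample shown above (an "instances" list whose entries carry "displayName" and "apiEndpoint").

```python
import json


def find_api_endpoint(list_response: str, display_name: str) -> str:
    """Return the apiEndpoint of the instance whose displayName matches.

    list_response is the raw JSON text returned by instances.list.
    """
    payload = json.loads(list_response)
    for instance in payload.get("instances", []):
        if instance.get("displayName") == display_name:
            return instance["apiEndpoint"]
    raise LookupError(f"no instance with displayName {display_name!r}")


# Trimmed copy of the response shown above, used here as sample input.
sample_response = json.dumps({
    "instances": [
        {
            "displayName": "data-fusion-1",
            "state": "RUNNING",
            "apiEndpoint": "https://data-fusion-1-PROJECT-dot-euw4.datafusion.googleusercontent.com/api",
        }
    ]
})

print(find_api_endpoint(sample_response, "data-fusion-1"))
```

Matching on displayName is a choice made for this sketch; matching on the full "name" field would work the same way.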
Note that apiEndpoint has the form:
https://<INSTANCE_DISPLAY_NAME>-<PROJECT_ID>-dot-<REGION_ACRONYM>.datafusion.googleusercontent.com/api
Now we can follow the CDAP reference guide to look at, for example, the run history of a pipeline:
GET hostname/api/v3/namespaces/namespace-id/apps/pipeline-name/workflows/DataPipelineWorkflow/runs
where hostname is the serviceEndpoint obtained earlier and namespace-id is default for a BASIC instance (with Enterprise you can have other namespaces). In my case pipeline-name is BQ-to-GCS:
curl -H "Authorization: Bearer $TOKEN" \
https://data-fusion-1-$PROJECT-dot-euw4.datafusion.googleusercontent.com/api/v3/namespaces/default/apps/BQ-to-GCS/workflows/DataPipelineWorkflow/runs
[{"runid":"REDACTED","starting":1573395214,"start":1573395401,"end":1573395492,"status":"COMPLETED",
"properties":{"runtimeArgs":"{\"logical.start.time\":\"1573395214003\",\"system.profile.name\":\"SYSTEM:dataproc\"}",
"phase-1":"b8f5c7d1-03c4-11ea-a553-42010aa40019"},"cluster":{"status":"DEPROVISIONED","end":1573395539,"numNodes":3},
"profile":{"profileName":"dataproc","namespace":"system","entity":"PROFILE"}}]
There are now also Cloud Composer operators that make the API calls to Data Fusion for you, which makes this much simpler.
An example of starting a Data Fusion pipeline from a Cloud Composer DAG:
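from airflow.providers.google.cloud.operators.datafusion import (
    CloudDataFusionStartPipelineOperator,
)

# LOCATION, PIPELINE_NAME and INSTANCE_NAME are placeholders defined elsewhere in the DAG file.
start_pipeline = CloudDataFusionStartPipelineOperator(
    location=LOCATION,
    pipeline_name=PIPELINE_NAME,
    instance_name=INSTANCE_NAME,
    task_id="start_pipeline",
)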
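If you call the REST API directly from a task instead of using the operator, the run-history request and response shown earlier can be handled roughly as follows. This is a sketch under assumptions: runs_url and summarize_runs are hypothetical helper names, and the "start" and "end" fields are treated as epoch seconds, which is what the sample values suggest. No HTTP call is made here; the URL would be fetched with the same Bearer token as in the curl examples.

```python
import json
from urllib.parse import quote


def runs_url(api_endpoint: str, namespace: str, pipeline: str) -> str:
    """Build the run-history URL for a batch pipeline from its apiEndpoint."""
    base = api_endpoint.rstrip("/")
    return (
        f"{base}/v3/namespaces/{quote(namespace, safe='')}"
        f"/apps/{quote(pipeline, safe='')}/workflows/DataPipelineWorkflow/runs"
    )


def summarize_runs(runs_json: str) -> list:
    """Return a (status, duration_in_seconds) tuple for each run record.

    The duration is None when a run has no start or end timestamp yet.
    """
    summary = []
    for run in json.loads(runs_json):
        start, end = run.get("start"), run.get("end")
        duration = end - start if start is not None and end is not None else None
        summary.append((run.get("status"), duration))
    return summary


# Trimmed copy of the run record shown earlier, used as sample input.
sample_runs = json.dumps([
    {"runid": "REDACTED", "start": 1573395401, "end": 1573395492, "status": "COMPLETED"}
])

print(runs_url("https://HOSTNAME/api", "default", "BQ-to-GCS"))
print(summarize_runs(sample_runs))
```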