Log a metric from inside a PythonScriptStep to the parent PipelineRun

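SDK version: 1.0.43

To cut down on clicking around and make it easier to compare accuracy across PipelineRuns, I'd like to log a metric from inside a PythonScriptStep to the parent PipelineRun. I thought I could do it like this: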

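from azureml.core import Run

# Get the run for the currently executing step
run = Run.get_context()
foo = 0.80
# Try to log the metric on the parent (pipeline) run
run.parent.log("accuracy", foo)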

But I get this error:

Traceback (most recent call last):
  File "get_metrics.py", line 62, in <module>
    run.parent.log("geo_mean", top3_runs)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/core/run.py", line 459, in parent
    return None if parent_run_id is None else get_run(self.experiment, parent_run_id)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/core/run.py", line 1713, in get_run
    return next(runs)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/core/run.py", line 297, in _rehydrate_runs
    yield factory(experiment, run_dto)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/pipeline/core/run.py", line 325, in _from_dto
    return PipelineRun(experiment=experiment, run_id=run_dto.run_id)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/pipeline/core/run.py", line 74, in __init__
    service_endpoint=_service_endpoint)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/pipeline/core/_graph_context.py", line 46, in __init__
    service_endpoint=service_endpoint)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/pipeline/core/_aeva_provider.py", line 118, in create_provider
    service_endpoint=service_endpoint)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/pipeline/core/_aeva_provider.py", line 133, in create_service_caller
    service_endpoint = _AevaWorkflowProvider.get_endpoint_url(workspace, experiment_name)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/pipeline/core/_aeva_provider.py", line 153, in get_endpoint_url
    workspace_name=workspace.name, workspace_id=workspace._workspace_id)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/core/workspace.py", line 749, in _workspace_id
    self.get_details()
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/core/workspace.py", line 594, in get_details
    self._subscription_id)
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/_project/_commands.py", line 507, in show_workspace
    AzureMachineLearningWorkspaces, subscription_id).workspaces,
  File "/azureml-envs/azureml_ffecfef6fbfa1d89f72d5af22e52c081/lib/python3.6/site-packages/azureml/core/authentication.py", line 112, in _get_service_client
    all_subscription_list, tenant_id = self._get_all_subscription_ids()
TypeError: 'NoneType' object is not iterable

Update

On further investigation, I tried printing the run's parent attribute with the line below and got the same traceback:

print("print run parent attribute", run.parent)

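Below is the output of the get_properties() method. My guess is that azureml uses only the azureml.pipelinerunid property for the pipeline tree hierarchy, and that the parent attribute has been left for any user-defined hierarchy.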

{
    "azureml.runsource": "azureml.StepRun",
    "ContentSnapshotId": "45bdecd3-1c43-48da-af5c-c95823c407e0",
    "StepType": "PythonScriptStep",
    "ComputeTargetType": "AmlCompute",
    "azureml.pipelinerunid": "e523d575-c373-46d2-a4bc-1717f5e34ec2",
    "_azureml.ComputeTargetType": "batchai",
    "AzureML.DerivedImageName": "azureml/azureml_dfd7f4f952ace529f986fe919909c3ec"
}
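
If the pipeline run ID really is carried in the azureml.pipelinerunid property, as the output above suggests, a step could read that ID from its own properties instead of going through run.parent. This is only a minimal sketch: the helper name pipeline_run_id is hypothetical, and whether building a Run from this ID avoids the error on 1.0.43 has not been verified.

```python
from typing import Mapping, Optional

# Property key seen in the get_properties() output above.
PIPELINE_RUN_ID_KEY = "azureml.pipelinerunid"


def pipeline_run_id(properties: Mapping[str, str]) -> Optional[str]:
    """Return the enclosing pipeline run's ID, or None if the key is absent.

    Hypothetical helper for illustration; it is not part of the azureml SDK.
    """
    return properties.get(PIPELINE_RUN_ID_KEY)


# Example using a subset of the properties printed above.
props = {
    "azureml.runsource": "azureml.StepRun",
    "StepType": "PythonScriptStep",
    "azureml.pipelinerunid": "e523d575-c373-46d2-a4bc-1717f5e34ec2",
}
print(pipeline_run_id(props))
```

Inside a step, the dictionary would come from Run.get_context().get_properties(). The returned ID could then presumably be passed to Run(experiment, run_id), the same constructor form used further down in this thread.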

Is run defined? Could you try...

run = Run.get_context()
run.parent.log('metric1', 0.80)

I cannot reproduce this on 1.0.60 (latest) or on 1.0.43 (as in the post).

I am able to call run.parent.log()

run = Run.get_context()
run.parent.log('metric1', 0.80)

from inside the step, and then I can do

r = Run(ws.experiments["expName"], runid)
r.get_metrics()

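and see the metric on the pipeline run.

I'm not sure whether I'm missing something.

Please upgrade your SDK to the latest version. It seems this issue was fixed sometime after 1.0.43.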
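
To confirm which SDK version the step actually runs with, the installed version string can be compared against 1.0.43. A minimal sketch, assuming purely numeric dotted version strings; parse_version and is_newer_than are hypothetical helper names, and the thread does not say which release contained the fix, so passing this check does not guarantee the problem is gone.

```python
def parse_version(version):
    """Turn a dotted version string such as "1.0.43" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_newer_than(installed, baseline="1.0.43"):
    """Return True if the installed version is strictly newer than the baseline."""
    return parse_version(installed) > parse_version(baseline)


# Versions mentioned in this thread.
print(is_newer_than("1.0.60"))
print(is_newer_than("1.0.43"))
```

Inside the step script, the installed version is available as azureml.core.VERSION, so printing it at the top of the script shows what the remote environment resolved to.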