Azure DevOps ML error: We could not find config.json in: /home/vsts/work/1/s or in its parent directories

Question

I am trying to create an Azure DevOps ML pipeline. The following code works 100% fine in a Jupyter notebook, but when I run it in Azure DevOps I get this error:

Traceback (most recent call last):
  File "src/my_custom_package/data.py", line 26, in <module>
    ws = Workspace.from_config()
  File "/opt/hostedtoolcache/Python/3.8.7/x64/lib/python3.8/site-packages/azureml/core/workspace.py", line 258, in from_config
    raise UserErrorException('We could not find config.json in: {} or in its parent directories. '
azureml.exceptions._azureml_exception.UserErrorException: UserErrorException:
    Message: We could not find config.json in: /home/vsts/work/1/s or in its parent directories. Please provide the full path to the config file or ensure that config.json exists in the parent directories.
    InnerException None
    ErrorResponse 
{
    "error": {
        "code": "UserError",
        "message": "We could not find config.json in: /home/vsts/work/1/s or in its parent directories. Please provide the full path to the config file or ensure that config.json exists in the parent directories."
    }
}

The code is:

#import
from sklearn.model_selection import train_test_split
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.core.experiment import Experiment
from datetime import date
from azureml.core import Workspace, Dataset



import pandas as pd
import numpy as np
import logging

#getdata
subscription_id = 'mysubid'
resource_group = 'myrg'
workspace_name = 'mlplayground'
workspace = Workspace(subscription_id, resource_group, workspace_name)
dataset = Dataset.get_by_name(workspace, name='correctData')


#auto ml
ws = Workspace.from_config()


automl_settings = {
    "iteration_timeout_minutes": 2880,
    "experiment_timeout_hours": 48,
    "enable_early_stopping": True,
    "primary_metric": 'spearman_correlation',
    "featurization": 'auto',
    "verbosity": logging.INFO,
    "n_cross_validations": 5,
    "max_concurrent_iterations": 4,
    "max_cores_per_iteration": -1,
}



cpu_cluster_name = "computecluster"
compute_target = ComputeTarget(workspace=ws, name=cpu_cluster_name)
print(compute_target)
automl_config = AutoMLConfig(task='regression',
                             compute_target = compute_target,
                             debug_log='automated_ml_errors.log',
                             training_data = dataset,
                             label_column_name="paidInDays",
                             **automl_settings)

today = date.today()
d4 = today.strftime("%b-%d-%Y")

experiment = Experiment(ws, "myexperiment"+d4)
remote_run = experiment.submit(automl_config, show_output = True)

from azureml.widgets import RunDetails
RunDetails(remote_run).show()

remote_run.wait_for_completion()

Answer 1

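You need to pass the path to your config file to Workspace.from_config(). Under https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py you can find the following instructions on how to create the config file. Create a workspace: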

from azureml.core import Workspace
ws = Workspace.create(name='myworkspace',
           subscription_id='<azure-subscription-id>',
           resource_group='myresourcegroup',
           create_resource_group=True,
           location='eastus2'
           )

Save the workspace configuration:

ws.write_config(path="./file-path", file_name="config.json")

Load the configuration from the default path:

ws = Workspace.from_config()
ws.get_details()

Or load the configuration from a specified path:

ws = Workspace.from_config(path="my/path/config.json")

More details on from_config can be found here: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py#from-config-path-none--auth-none---logger-none---file-name-none-
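On a hosted build agent the checkout usually contains no config.json, which is why the default search fails. One option is to generate the file at the start of the run and pass its path explicitly. The following is a minimal sketch, not the documented approach: the helper name write_workspace_config is hypothetical, and the values shown in the usage comments are the placeholders from the question.

```python
import json
from pathlib import Path


def write_workspace_config(target_dir, subscription_id, resource_group, workspace_name):
    """Write a config.json holding the three workspace identifiers.

    Returns the path of the written file so it can be passed to
    Workspace.from_config(path=...).
    """
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    config_path = Path(target_dir) / "config.json"
    # Create the target directory if the agent has not created it yet.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=4))
    return config_path


# Usage in the pipeline script, before the AutoML code (requires azureml-core):
# from azureml.core import Workspace
# config_path = write_workspace_config(".", "mysubid", "myrg", "mlplayground")
# ws = Workspace.from_config(path=str(config_path))
```

Passing the path explicitly avoids relying on the directory search, so the script behaves the same regardless of the agent's working directory.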

Answer 2

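Something odd is going on in your code: you get the data from a first workspace (workspace = Workspace(subscription_id, resource_group, workspace_name)) and then use the resources of a second workspace (ws = Workspace.from_config()). I would suggest not making the code depend on two different workspaces, especially since a single underlying data source can be registered (linked) to multiple workspaces (documentation).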

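Using a config.json file when instantiating a Workspace object will usually result in interactive authentication. When your code runs, a log message asks you to go to a specific URL and enter a code. This uses your Microsoft account to verify that you are authorized to access the Azure resource (in this case, your Workspace('mysubid', 'myrg', 'mlplayground')). This has its limits once you start deploying the code to virtual machines or agents, where you will not always be there to check the logs, visit the URL, and authenticate yourself.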

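For this reason, setting up a more advanced authentication method is strongly recommended. I personally suggest using a service principal because, when done properly, it is simple, convenient, and secure. You can follow Azure's official documentation here.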
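As a rough illustration of the service principal route, the sketch below reads the credentials from environment variables and builds a single Workspace object without any interactive prompt. The environment variable names and both helper functions are assumptions made for this example; ServicePrincipalAuthentication and Workspace.get come from azureml-core. In Azure DevOps the secret would typically be stored as a secret pipeline variable and mapped into the environment.

```python
import os

# Hypothetical variable names; map them to (secret) pipeline variables.
REQUIRED_VARS = [
    "AML_TENANT_ID",
    "AML_CLIENT_ID",
    "AML_CLIENT_SECRET",
    "AML_SUBSCRIPTION_ID",
    "AML_RESOURCE_GROUP",
    "AML_WORKSPACE_NAME",
]


def read_service_principal_settings(env=os.environ):
    """Collect the required settings, failing early if any are missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def get_workspace(settings):
    """Return a Workspace authenticated with a service principal."""
    # Imported lazily so the settings helper works without azureml-core installed.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(
        tenant_id=settings["AML_TENANT_ID"],
        service_principal_id=settings["AML_CLIENT_ID"],
        service_principal_password=settings["AML_CLIENT_SECRET"],
    )
    return Workspace.get(
        name=settings["AML_WORKSPACE_NAME"],
        auth=auth,
        subscription_id=settings["AML_SUBSCRIPTION_ID"],
        resource_group=settings["AML_RESOURCE_GROUP"],
    )


# Usage (requires azureml-core and the variables above):
# ws = get_workspace(read_service_principal_settings())
# dataset = Dataset.get_by_name(ws, name='correctData')
# compute_target = ComputeTarget(workspace=ws, name="computecluster")
```

Using the one ws object for both the dataset and the compute target also removes the dependency on two separate workspace objects.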