AzureML: ResolvePackageNotFound azureml-dataprep

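I have a basic ScriptStep in my AML pipeline that just tries to read an attached dataset. When I run this simple example, the pipeline fails with the following in the driver log: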

ImportError: azureml-dataprep is not installed. Dataset cannot be used without azureml-dataprep. Please make sure azureml-dataprep[fuse,pandas] is installed by specifying it in the conda dependencies. pandas is optional and should be only installed if you intend to create a pandas DataFrame from the dataset.

I then modified my step to include the conda package, but now the driver fails with "ResolvePackageNotFound: azureml-dataprep". The full log file is available here.

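import os
from azureml.core.runconfig import RunConfiguration, DEFAULT_CPU_IMAGE
from azureml.core.conda_dependencies import CondaDependencies
from azureml.pipeline.steps import PythonScriptStep

# create a new runconfig object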
run_config = RunConfiguration()
run_config.environment.docker.enabled = True
run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
run_config.environment.python.user_managed_dependencies = False
run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['azureml-dataprep[pandas,fuse]'])

source_directory = './read-step'
print('Source directory for the step is {}.'.format(os.path.realpath(source_directory)))
step2 = PythonScriptStep(name="read_step",
                         script_name="Read.py", 
                         arguments=["--dataFilePath", dataset.as_named_input('local_ds').as_mount() ],
                         compute_target=aml_compute, 
                         source_directory=source_directory,
                         runconfig=run_config,
                         allow_reuse=False)

I'm out of ideas, so any help here would be greatly appreciated!

azureml-sdk isn't available on conda, so you need to install it with pip instead.

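from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
myenv = Environment(name="myenv")
# add_pip_package modifies the object in place, so keep a reference to it first
conda_dep = CondaDependencies()
conda_dep.add_pip_package("azureml-dataprep[pandas,fuse]")
myenv.python.conda_dependencies = conda_dep
run_config.environment = myenv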
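
To see why listing the package under conda fails while listing it under pip works, it can help to look at the shape of the environment specification each option produces. The following is only a rough stdlib sketch, not the SDK's actual implementation: `build_conda_spec` is a hypothetical helper, and the dict merely mimics the layout of a conda environment file, where pip packages sit in a nested `pip` entry instead of among conda's own dependencies. It leaves out anything else the SDK may add on its own.

```python
def build_conda_spec(conda_packages=(), pip_packages=()):
    """Return a dict shaped like a conda environment file (hypothetical helper)."""
    dependencies = list(conda_packages)
    if pip_packages:
        # pip itself is listed as a conda dependency so the nested pip section can run.
        dependencies.append("pip")
        dependencies.append({"pip": list(pip_packages)})
    return {"name": "myenv", "dependencies": dependencies}


# Shape of the question's request: conda has to resolve the package on its own channels.
failing = build_conda_spec(conda_packages=["azureml-dataprep[pandas,fuse]"])

# Shape of the answer's request: conda only has to provide pip, and pip resolves the package.
working = build_conda_spec(pip_packages=["azureml-dataprep[pandas,fuse]"])

print(failing)
print(working)
```

In the first spec the package name is handed straight to conda's resolver, which is where ResolvePackageNotFound comes from; in the second it is only ever seen by pip.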

For more information about this error, the Logs tab has a log named 20_image_build_log.txt, which is the Docker build log. It contains the error saying that conda failed to find azureml-dataprep.
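
If the build log is long, a small script can pull out the packages conda could not resolve. This is a minimal sketch under stated assumptions: `missing_conda_packages` is a hypothetical helper, and `sample_log` is an invented excerpt that assumes conda lists unresolved packages as `- name` lines under a `ResolvePackageNotFound:` header. A real 20_image_build_log.txt will contain much more output and may be formatted differently.

```python
def missing_conda_packages(log_text):
    """Collect package names reported under 'ResolvePackageNotFound' in a build log."""
    missing = []
    in_block = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if "ResolvePackageNotFound" in stripped:
            in_block = True
            # Also handle the single-line form "ResolvePackageNotFound: pkg".
            _, _, rest = stripped.partition("ResolvePackageNotFound:")
            if rest.strip():
                missing.append(rest.strip())
            continue
        if in_block and stripped.startswith("- "):
            missing.append(stripped[2:].strip())
        elif in_block and stripped:
            # Any other non-empty line ends the list of missing packages.
            in_block = False
    return missing


# Hypothetical excerpt, written only to exercise the helper.
sample_log = """\
Solving environment: ...working... failed

ResolvePackageNotFound:
  - azureml-dataprep[pandas,fuse]

Some later line
"""

print(missing_conda_packages(sample_log))
```

Any name this reports is a candidate for moving from the conda package list to the pip package list.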

Edit:

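Soon you won't have to specify this dependency at all. The Azure Data4ML team says azureml-dataprep[pandas,fuse] is being added as a dependency of azureml-defaults, which is installed automatically on all images.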