"Failure Exception: OSError: [Errno 30] Read-only file system" when using AzureML in Python Azure Function
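Question
I am trying to prepare and then submit a new experiment to Azure Machine Learning from an Azure Function in Python. To do so, I register a new dataset for my Azure ML workspace, containing the training data for my ML model, using dataset.register(... . However, when I try to create this dataset with the following line of code
dataset = Dataset.Tabular.from_delimited_files(path = datastore_paths)
I get a Failure Exception: OSError: [Errno 30] Read-only file system ... .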
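Thoughts
- I know that I shouldn't write to the file system from within an Azure Function if possible. But I don't actually want to write anything to the local file system. I only want to create the dataset as a reference to my blob storage under datastore_path and then register it to my Azure Machine Learning workspace. But it seems that the method from_delimited_files tries to write to the file system anyway (maybe some caching?).
- I also know that there is a temp folder in which writing temporary files is allowed. However, I believe I cannot really control where this method writes data. I already tried changing the current working directory to this temp folder with os.chdir(tempfile.gettempdir()) just before calling the function, but that didn't help.
Any other ideas? I don't think I'm doing anything particularly unusual...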
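To narrow down where writes are allowed, a small probe can be run inside the function. This is only a sketch: is_writable is a hypothetical helper, and /home/site/wwwroot is the deployment directory that appears in the stack trace below, assumed here to be the read-only location.

```python
import tempfile


def is_writable(directory: str) -> bool:
    """Return True if a temporary file can be created in `directory`."""
    try:
        # TemporaryFile creates and then removes a file; any failure to do so
        # (read-only file system, missing directory, permissions) is an OSError.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        return False


# The temp folder mentioned above versus the deployment directory (assumed).
for candidate in (tempfile.gettempdir(), "/home/site/wwwroot"):
    print(candidate, is_writable(candidate))
```

Logging the result for both directories would show whether the temp folder is really the only writable place in the deployed environment.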
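Details
I am using Python 3.7 and azureml-sdk 1.9.0, and I can run the Python script locally without problems. I currently deploy from VSCode using the Azure Functions extension version 0.23.0 (and an Azure DevOps pipeline for CI/CD).
Here is my full stack trace: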
Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.HttpTrigger_Train
---> Microsoft.Azure.WebJobs.Script.Workers.Rpc.RpcException: Result: Failure
Exception: OSError: [Errno 30] Read-only file system: '/home/site/wwwroot/.python_packages/lib/site-packages/dotnetcore2/bin/deps.lock'
Stack: File "/azure-functions-host/workers/python/3.7/LINUX/X64/azure_functions_worker/dispatcher.py", line 345, in _handle__invocation_request
self.__run_sync_func, invocation_id, fi.func, args)
File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/azure-functions-host/workers/python/3.7/LINUX/X64/azure_functions_worker/dispatcher.py", line 480, in __run_sync_func
return func(**params)
File "/home/site/wwwroot/HttpTrigger_Train/__init__.py", line 11, in main
train()
File "/home/site/wwwroot/shared_code/train.py", line 70, in train
dataset = Dataset.Tabular.from_delimited_files(path = datastore_paths)
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/data/_loggerfactory.py", line 126, in wrapper
return func(*args, **kwargs)
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/data/dataset_factory.py", line 308, in from_delimited_files
quoting=support_multi_line)
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/readers.py", line 100, in read_csv
df = Dataflow._path_to_get_files_block(path, archive_options)
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/dataflow.py", line 2387, in _path_to_get_files_block
return datastore_to_dataflow(path)
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/_datastore_helper.py", line 41, in datastore_to_dataflow
datastore, datastore_value = get_datastore_value(source)
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/_datastore_helper.py", line 83, in get_datastore_value
_set_auth_type(workspace)
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/_datastore_helper.py", line 134, in _set_auth_type
get_engine_api().set_aml_auth(SetAmlAuthMessageArgument(AuthType.SERVICEPRINCIPAL, json.dumps(auth)))
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/engineapi/api.py", line 18, in get_engine_api
_engine_api = EngineAPI()
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/engineapi/api.py", line 55, in __init__
self._message_channel = launch_engine()
File "/home/site/wwwroot/.python_packages/lib/site-packages/azureml/dataprep/api/engineapi/engine.py", line 300, in launch_engine
dependencies_path = runtime.ensure_dependencies()
File "/home/site/wwwroot/.python_packages/lib/site-packages/dotnetcore2/runtime.py", line 141, in ensure_dependencies
with _FileLock(deps_lock_path, raise_on_timeout=timeout_exception):
File "/home/site/wwwroot/.python_packages/lib/site-packages/dotnetcore2/runtime.py", line 113, in __enter__
self.acquire()
File "/home/site/wwwroot/.python_packages/lib/site-packages/dotnetcore2/runtime.py", line 72, in acquire
self.lockfile = os.open(self.lockfile_path, os.O_CREAT | os.O_EXCL | os.O_RDWR)
at Microsoft.Azure.WebJobs.Script.Description.WorkerFunctionInvoker.InvokeCore(Object[] parameters, FunctionInvocationContext context) in /src/azure-functions-host/src/WebJobs.Script/Description/Workers/WorkerFunctionInvoker.cs:line 85
at Microsoft.Azure.WebJobs.Script.Description.FunctionInvokerBase.Invoke(Object[] parameters) in /src/azure-functions-host/src/WebJobs.Script/Description/FunctionInvokerBase.cs:line 85
at Microsoft.Azure.WebJobs.Script.Description.FunctionGenerator.Coerce[T](Task`1 src) in /src/azure-functions-host/src/WebJobs.Script/Description/FunctionGenerator.cs:line 225
at Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker`2.InvokeAsync(Object instance, Object[] arguments) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionInvoker.cs:line 52
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 587
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 532
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, IFunctionOutputDefinition outputDefinition, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 470
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 278
--- End of inner exception stack trace ---
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 325
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryExecuteAsyncCore(IFunctionInstanceEx functionInstance, CancellationToken cancellationToken) in C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 117
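The path in the error message is absolute and points inside the installed dotnetcore2 package, which would explain why changing the working directory had no effect. The sketch below illustrates that point only. lock_path_for is a hypothetical helper that assumes the lock path is derived from the package's install location; it is not copied from dotnetcore2.

```python
import os
import tempfile


def lock_path_for(package_file: str) -> str:
    """Build an absolute lock-file path next to a package's own files.

    Assumption: this mirrors how the path in the error message appears
    to be formed; the real dotnetcore2 logic may differ.
    """
    package_dir = os.path.dirname(os.path.abspath(package_file))
    return os.path.join(package_dir, "bin", "deps.lock")


# Install location taken from the stack trace above.
package_file = (
    "/home/site/wwwroot/.python_packages/lib/site-packages/"
    "dotnetcore2/runtime.py"
)

before = lock_path_for(package_file)
os.chdir(tempfile.gettempdir())  # the workaround tried in the question
after = lock_path_for(package_file)

# The path is the same before and after os.chdir.
print(before == after)
```

Because the path never depends on the current working directory, os.chdir cannot redirect the write to the temp folder.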
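The problem was an incompatible OS version in my virtual environment.
Many thanks to PramodValavala-MSFT for the idea of creating a docker container! Following his suggestion, I suddenly got the following error message for the dataset = Dataset.Tabular.from_delimited_files(path = datastore_paths) command:
Exception: NotImplementedError: Unsupported Linux distribution debian 10.
This reminded me of the following warning in the Azure Machine Learning documentation: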
Some dataset classes have dependencies on the azureml-dataprep
package, which is only compatible with 64-bit Python. For Linux users,
these classes are supported only on the following distributions: Red
Hat Enterprise Linux (7, 8), Ubuntu (14.04, 16.04, 18.04), Fedora (27,
28), Debian (8, 9), and CentOS (7).
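Choosing the predefined docker image 2.0-python3.7 (running Debian 9) instead of 3.0-python3.7 (running Debian 10) solved the issue (see https://hub.docker.com/_/microsoft-azure-functions-python).
I suspect that the default virtual environment I originally used was also running on an incompatible OS.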
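To check an environment up front, the distribution can be read from /etc/os-release and compared with the list in the warning. The following is a rough sketch and not how azureml-dataprep or dotnetcore2 detect the distribution. parse_os_release and is_supported are hypothetical helpers, and the distribution keys are assumed to match the ID field of /etc/os-release.

```python
from typing import Dict

# Distributions and versions as listed in the documentation warning.
# Keys are assumed to match the ID field of /etc/os-release.
SUPPORTED = {
    "rhel": ("7", "8"),
    "ubuntu": ("14.04", "16.04", "18.04"),
    "fedora": ("27", "28"),
    "debian": ("8", "9"),
    "centos": ("7",),
}


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines (os-release format) into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_supported(fields: Dict[str, str]) -> bool:
    """Check ID and VERSION_ID against the SUPPORTED table."""
    distro = fields.get("ID", "").lower()
    version = fields.get("VERSION_ID", "")
    versions = SUPPORTED.get(distro, ())
    # Accept an exact match ("18.04") or a match on the major part ("8" for "8.2").
    return version in versions or version.split(".")[0] in versions


# Sample file contents for the two images compared above.
debian_10 = 'ID=debian\nVERSION_ID="10"\n'
debian_9 = 'ID=debian\nVERSION_ID="9"\n'
print(is_supported(parse_os_release(debian_10)))  # False
print(is_supported(parse_os_release(debian_9)))   # True
```

In a real function the text would come from reading /etc/os-release, so the check could fail fast with a clear message before any dataset call is made.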