带有 Synapse ML 预测的 Azure Synapste 预测模型
Azure Synapste Predict Model with Synapse ML predict
但是当我执行时:
#Bind model within Spark session
model = pcontext.bind_model(
return_types=RETURN_TYPES,
runtime=RUNTIME,
model_alias="Sales", #This alias will be used in PREDICT call to refer this model
model_uri=AML_MODEL_URI, #In case of AML, it will be AML_MODEL_URI
aml_workspace=ws #This is only for AML. In case of ADLS, this parameter can be removed
).register()
我有:
NotADirectoryError: [Errno 20] Not a directory: '/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1648328086462_0002/spark-3d802a7e-15b7-4eb6-88c5-f0e01f8cdb35/userFiles-fbe23a43-67d3-4e65-a879-4a497e804b40/68603955220f5f8646700d809b71be9949011a2476a34965a3d5c0f3d14de79b.pkl/MLmodel'
Traceback (most recent call last):
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_context.py", line 47, in bind_model
udf = _create_udf(
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_udf.py", line 104, in _create_udf
model_runtime = runtime_gen._create_runtime()
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_runtime.py", line 103, in _create_runtime
if self._check_model_runtime_compatibility(model_runtime):
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_runtime.py", line 166, in _check_model_runtime_compatibility
model_wrapper = self._load()
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_runtime.py", line 78, in _load
return SynapsePredictModelCache._get_or_load(
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_cache.py", line 172, in _get_or_load
model = load_model(runtime, model_uri, functions)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 257, in load_model
model = loader.load(model_uri, functions)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 122, in load
model = self._load(model_uri)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 215, in _load
return self._load_mlflow(model_uri)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 59, in _load_mlflow
model = mlflow.pyfunc.load_model(model_uri)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/mlflow/pyfunc/
init
.py", line 640, in load_model
model_meta = Model.load(os.path.join(local_path, MLMODEL_FILE_NAME))
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/mlflow/models/model.py", line 124, in load
with open(path) as f:
NotADirectoryError: [Errno 20] Not a directory: '/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1648328086462_0002/spark-3d802a7e-15b7-4eb6-88c5-f0e01f8cdb35/userFiles-fbe23a43-67d3-4e65-a879-4a497e804b40/68603955220f5f8646700d809b71be9949011a2476a34965a3d5c0f3d14de79b.pkl/MLmodel'
我该如何修复该错误?
(UPDATE:29/3/2022): You will experiencing this error message if you model does not contains all the required files in the ML model.
根据重现,我创建了两个名为:
的 ML 模型
sklearn_regression_model: Which contains only sklearn_regression_model.pkl
file.
When I predict for MLFLOW packaged model named sklearn_regression_model
, getting same error as shown above:
linear_regression: Which contains the below files:
When I predict for MLFLOW packaged model named linear_regression
, it works as excepted.
It should be AML_MODEL_URI = "" #In URI ":x" => Rossman_Sales:2
Before running this script, update it with the URI for ADLS Gen2 data file along with model output return data type and ADLS/AML URI for the model file.
#Set model URI
#Set AML URI, if trained model is registered in AML
AML_MODEL_URI = "<aml model uri>" #In URI ":x" signifies model version in AML. You can choose which model version you want to run. If ":x" is not provided then by default latest version will be picked.
#Set ADLS URI, if trained model is uploaded in ADLS
ADLS_MODEL_URI = "abfss://<filesystemname>@<account name>.dfs.core.windows.net/<model mlflow folder path>"
来自 AML 工作区的模型 URI:
DATA_FILE = "abfss://data@cheprasynapse.dfs.core.windows.net/AML/LengthOfStay_cooked_small.csv"
AML_MODEL_URI_SKLEARN = "aml://mlflow_sklearn:1" #Here ":1" signifies model version in AML. We can choose which version we want to run. If ":1" is not provided then by default latest version will be picked
RETURN_TYPES = "INT"
RUNTIME = "mlflow"
上传到 ADLS Gen2 的模型 URI:
DATA_FILE = "abfss://data@cheprasynapse.dfs.core.windows.net/AML/LengthOfStay_cooked_small.csv"
AML_MODEL_URI_SKLEARN = "abfss://data@cheprasynapse.dfs.core.windows.net/linear_regression/linear_regression" #Here ":1" signifies model version in AML. We can choose which version we want to run. If ":1" is not provided then by default latest version will be picked
RETURN_TYPES = "INT"
RUNTIME = "mlflow"
但是当我执行时:
#Bind model within Spark session
model = pcontext.bind_model(
return_types=RETURN_TYPES,
runtime=RUNTIME,
model_alias="Sales", #This alias will be used in PREDICT call to refer this model
model_uri=AML_MODEL_URI, #In case of AML, it will be AML_MODEL_URI
aml_workspace=ws #This is only for AML. In case of ADLS, this parameter can be removed
).register()
我有:
NotADirectoryError: [Errno 20] Not a directory: '/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1648328086462_0002/spark-3d802a7e-15b7-4eb6-88c5-f0e01f8cdb35/userFiles-fbe23a43-67d3-4e65-a879-4a497e804b40/68603955220f5f8646700d809b71be9949011a2476a34965a3d5c0f3d14de79b.pkl/MLmodel'
Traceback (most recent call last):
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_context.py", line 47, in bind_model
udf = _create_udf(
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_udf.py", line 104, in _create_udf
model_runtime = runtime_gen._create_runtime()
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_runtime.py", line 103, in _create_runtime
if self._check_model_runtime_compatibility(model_runtime):
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_runtime.py", line 166, in _check_model_runtime_compatibility
model_wrapper = self._load()
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_runtime.py", line 78, in _load
return SynapsePredictModelCache._get_or_load(
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/core/_cache.py", line 172, in _get_or_load
model = load_model(runtime, model_uri, functions)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 257, in load_model
model = loader.load(model_uri, functions)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 122, in load
model = self._load(model_uri)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 215, in _load
return self._load_mlflow(model_uri)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/azure/synapse/ml/predict/utils/_model_loader.py", line 59, in _load_mlflow
model = mlflow.pyfunc.load_model(model_uri)
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/mlflow/pyfunc/
init
.py", line 640, in load_model
model_meta = Model.load(os.path.join(local_path, MLMODEL_FILE_NAME))
File "/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/mlflow/models/model.py", line 124, in load
with open(path) as f:
NotADirectoryError: [Errno 20] Not a directory: '/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1648328086462_0002/spark-3d802a7e-15b7-4eb6-88c5-f0e01f8cdb35/userFiles-fbe23a43-67d3-4e65-a879-4a497e804b40/68603955220f5f8646700d809b71be9949011a2476a34965a3d5c0f3d14de79b.pkl/MLmodel'
我该如何修复该错误?
(UPDATE:29/3/2022): You will experiencing this error message if you model does not contains all the required files in the ML model.
根据重现,我创建了两个名为:
的 ML 模型sklearn_regression_model: Which contains only
sklearn_regression_model.pkl
file.
When I predict for MLFLOW packaged model named
sklearn_regression_model
, getting same error as shown above:
linear_regression: Which contains the below files:
When I predict for MLFLOW packaged model named
linear_regression
, it works as excepted.
It should be AML_MODEL_URI = "" #In URI ":x" =>
Rossman_Sales:2
Before running this script, update it with the URI for ADLS Gen2 data file along with model output return data type and ADLS/AML URI for the model file.
#Set model URI
#Set AML URI, if trained model is registered in AML
AML_MODEL_URI = "<aml model uri>" #In URI ":x" signifies model version in AML. You can choose which model version you want to run. If ":x" is not provided then by default latest version will be picked.
#Set ADLS URI, if trained model is uploaded in ADLS
ADLS_MODEL_URI = "abfss://<filesystemname>@<account name>.dfs.core.windows.net/<model mlflow folder path>"
来自 AML 工作区的模型 URI:
DATA_FILE = "abfss://data@cheprasynapse.dfs.core.windows.net/AML/LengthOfStay_cooked_small.csv"
AML_MODEL_URI_SKLEARN = "aml://mlflow_sklearn:1" #Here ":1" signifies model version in AML. We can choose which version we want to run. If ":1" is not provided then by default latest version will be picked
RETURN_TYPES = "INT"
RUNTIME = "mlflow"
上传到 ADLS Gen2 的模型 URI:
DATA_FILE = "abfss://data@cheprasynapse.dfs.core.windows.net/AML/LengthOfStay_cooked_small.csv"
AML_MODEL_URI_SKLEARN = "abfss://data@cheprasynapse.dfs.core.windows.net/linear_regression/linear_regression" #Here ":1" signifies model version in AML. We can choose which version we want to run. If ":1" is not provided then by default latest version will be picked
RETURN_TYPES = "INT"
RUNTIME = "mlflow"