Azure ML: how to access logs of a failed Model deployment
I'm deploying a Keras model and it is failing with the error below. The exception says that I can retrieve the logs by running print(service.get_logs()), but that gives me empty results. I am deploying the model from my Azure Notebook, and I'm using the same service variable to retrieve the logs.
Also, how can I retrieve the logs from the container instance? I'm deploying to an AKS compute cluster I created. Sadly, the docs link in the exception doesn't detail how to retrieve these logs either.
More information can be found using '.get_logs()' Error:

{
  "code": "KubernetesDeploymentFailed",
  "statusCode": 400,
  "message": "Kubernetes Deployment failed",
  "details": [
    {
      "code": "CrashLoopBackOff",
      "message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.\nPlease check the logs for your container instance: my-model-service. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. \nYou can also try to run image mlwks.azurecr.io/azureml/azureml_3c0c34b65cf18c8644e8d745943ab7d2:latest locally. Please refer to http://aka.ms/debugimage#service-launch-fails for more information."
    }
  ]
}
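When get_logs() comes back empty, one likely cause is that the service variable still points at the handle returned by the failed Model.deploy() call. A minimal sketch of re-fetching the service by name and requesting a larger log window (assuming the v1 azureml-core SDK, whose get_logs() takes a num_lines argument):

from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()

# Re-fetch the service by name instead of reusing the variable from the
# failed Model.deploy() call, then pull a larger chunk of the container log.
service = Webservice(ws, 'my-model-service')
print(service.get_logs(num_lines=5000))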
Update
Here is the code I used to deploy the model:
from azureml.core import Environment
from azureml.core.compute import ComputeTarget
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import Webservice
from azureml.exceptions import WebserviceException

environment = Environment('my-environment')
environment.python.conda_dependencies = CondaDependencies.create(
    pip_packages=["azureml-defaults", "azureml-dataprep[pandas,fuse]",
                  "tensorflow", "keras", "matplotlib"])

service_name = 'my-model-service'

# Remove any existing service under the same name.
try:
    Webservice(ws, service_name).delete()
except WebserviceException:
    pass

inference_config = InferenceConfig(entry_script='score.py', environment=environment)
comp = ComputeTarget(workspace=ws, name="ml-inference-dev")
service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[model],
                       inference_config=inference_config,
                       deployment_target=comp)
service.wait_for_deployment(show_output=True)
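Since the crash happens inside init(), one way to see the traceback without round-tripping through AKS is a local Docker deployment. A sketch using the v1 SDK's LocalWebservice (assumes Docker is available on the machine running the notebook):

from azureml.core.webservice import LocalWebservice

# Deploy the same model and inference_config into a local Docker container;
# errors raised by score.py's init() show up directly in the output.
local_config = LocalWebservice.deploy_configuration(port=8890)
local_service = Model.deploy(workspace=ws,
                             name='my-model-local',
                             models=[model],
                             inference_config=inference_config,
                             deployment_config=local_config)
local_service.wait_for_deployment(show_output=True)
print(local_service.get_logs())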
And my score.py:
import numpy as np
from keras.models import load_model

from azureml.core.model import Model
from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType


def init():
    global model
    model_path = Model.get_model_path('model.h5')
    model = load_model(model_path)


# The run() method is called each time a request is made to the scoring API.
#
# Shown here are the optional input_schema and output_schema decorators
# from the inference-schema pip package. Using these decorators on your
# run() method parses and validates the incoming payload against
# the example input you provide here. This will also generate a Swagger
# API document for your web service.
@input_schema('data', NumpyParameterType(np.array([[0.1, 1.2, 2.3, 3.4, 4.5, 5.6, 6.7, 7.8, 8.9, 9.0]])))
@output_schema(NumpyParameterType(np.array([4429.929236457418])))
def run(data):
    return [123]  # test
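Even before any deployment, the scoring script can be smoke-tested in plain Python. A sketch under the assumption that model.h5 is resolvable locally (for example via an azureml-models folder in the working directory); a test like this fails fast on problems such as a missing inference_schema module:

import numpy as np

import score  # the scoring script above

# init() surfaces missing packages or a bad model file immediately; run()
# exercises the schema decorators with the same example payload they declare.
score.init()
print(score.run(np.array([[0.1, 1.2, 2.3, 3.4, 4.5, 5.6, 6.7, 7.8, 8.9, 9.0]])))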
Update 2:
Here is a screenshot of the endpoint page. Is a CPU of .1 normal? Also, when I hit the swagger URL in my browser, I get the error: "No ready replicas for service doc-classify-env-service".
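On the .1 CPU: that matches the small default per-replica CPU request rather than a problem, and "No ready replicas" is what you see while the container is crash-looping, since no replica ever passes its readiness probe. Resources and replica count can be set explicitly through a deployment configuration; a sketch with the v1 SDK (values are illustrative):

from azureml.core.webservice import AksWebservice

# Request explicit resources per replica instead of the small defaults.
aks_config = AksWebservice.deploy_configuration(cpu_cores=1,
                                                memory_gb=2,
                                                num_replicas=1)
service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aks_config,
                       deployment_target=comp)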
Update 3
After finally getting into the container logs, it turned out my score.py was hitting this error:
ModuleNotFoundError: No module named 'inference_schema'
I then ran a test with the "input_schema" and "output_schema" references commented out and my pip_packages simplified, and the REST endpoint came up! I was also able to get predictions out of the model.
pip_packages=["azureml-defaults", "tensorflow", "keras"]
So my question is, how should I structure my pip_packages so that the scoring file can use the inference_schema decorators? I assume I need to include the azureml-sdk[automl] pip package, but when I do, the image creation fails and I see several dependency conflicts.
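The decorators ship as their own pip package (inference-schema on PyPI), so the full azureml-sdk[automl] shouldn't be necessary; a sketch of the dependency list under that assumption, with the numpy extra that backs NumpyParameterType:

environment.python.conda_dependencies = CondaDependencies.create(
    pip_packages=["azureml-defaults",
                  "inference-schema[numpy-support]",  # provides the decorators
                  "tensorflow",
                  "keras"])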
Try retrieving your service directly from the workspace:
ws.webservices[service_name].get_logs()
Also, I've found it easier to deploy an image as an endpoint than the inference + deploy-model route (depends on your use case though):
from azureml.core.image import Image
from azureml.core.webservice import AksWebservice

my_image = Image(ws, name='test', version='26')
service = AksWebservice.deploy_from_image(ws, "test1", my_image, deployment_config, aks_target)