Debugging AML Model Deployment

Question

I have an ML model in Python that I trained locally. The model was previously deployed to a Windows IIS server, where it worked fine.

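Now I am trying to deploy it as a service on Azure Container Instances (ACI) with 1 core and 1 GB of memory. I followed two Microsoft documentation pages (one and two). All the steps in that documentation use the SDK, but I am using the GUI in the Azure portal.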

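After registering the model, I created an entry script and a conda environment YAML file (see below) and uploaded both under "Custom deployment assets" (in the Deploy model area).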

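Unfortunately, after I click Deploy, the deployment state stays at Transitioning. Even after 4 hours the state is unchanged, and there are no deployment logs, so I cannot find out what I am doing wrong.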

NOTE: below is just an excerpt of the entry script

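import os
import pandas as pd
import pickle
import re, json
import numpy as np
import sklearn

def init():
    # Runs once when the service starts: load the registered model from disk.
    global model
    global classes
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'randomForest50.pkl')
    with open(model_path, "rb") as f:
        model = pickle.load(f)

    # Map the predicted class index (0 or 1) to its label.
    classes = lambda x : ["F", "M"][x]

def run(data):
    # Runs per request; preprocessing() is defined in a part of the script not shown here.
    try:
        namesList = json.loads(data)["data"]["names"]
        pred = list(map(classes, model.predict(preprocessing(namesList))))
        return str(pred[0])
    except Exception as e:
        # Return the error message as the response body instead of raising.
        error = str(e)
        return error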

name: gender_prediction
dependencies:
- python
- numpy
- scikit-learn
- pip:
    - pandas
    - pickle
    - re
    - json

Answer 1

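The problem was in the YAML file. The dependencies listed in the YAML have to be packages that can actually be resolved when the conda environment is built. I changed the file accordingly, and the deployment worked.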

Modified YAML file:

name: gender_prediction
dependencies:
- python=3.7
- numpy
- scikit-learn
- pip:
    - azureml-defaults
    - pandas
    - pickle4
    - regex
    - inference-schema[numpy-support]
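
Because the portal showed no deployment logs, it may also help to exercise the entry script's `init()` and `run()` locally before deploying. The sketch below does that under clearly artificial assumptions: `StubModel` and this `preprocessing` function are hypothetical stand-ins (the real model and feature extraction are not shown in the question), and the `AZUREML_MODEL_DIR` variable is pointed at a temporary folder instead of the path Azure ML would provide.

```python
import json
import os
import pickle
import tempfile


class StubModel:
    """Hypothetical stand-in for the trained model; always predicts class 1."""

    def predict(self, rows):
        return [1 for _ in rows]


def preprocessing(names):
    # Hypothetical placeholder: the real feature extraction is not shown.
    return [[len(name)] for name in names]


def init():
    # Same loading logic as the entry script in the question.
    global model, classes
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "randomForest50.pkl")
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    classes = lambda x: ["F", "M"][x]


def run(data):
    try:
        names_list = json.loads(data)["data"]["names"]
        pred = list(map(classes, model.predict(preprocessing(names_list))))
        return str(pred[0])
    except Exception as e:
        return str(e)


# Simulate the deployed layout: a model file inside AZUREML_MODEL_DIR.
with tempfile.TemporaryDirectory() as model_dir:
    with open(os.path.join(model_dir, "randomForest50.pkl"), "wb") as f:
        pickle.dump(StubModel(), f)
    os.environ["AZUREML_MODEL_DIR"] = model_dir

    init()
    result = run(json.dumps({"data": {"names": ["Alex"]}}))
    bad_result = run("not json")  # run() returns the error text instead of raising

print(result)
```

Since `run()` catches every exception and returns the message as a string, a malformed request produces an error string rather than a failure, so it is worth checking the returned value and not just whether the call completed.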

