Upgrading the Python version for running and creating a custom container for a SageMaker endpoint

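[Update] We are currently working on a Multi-Armed Bandit model for sign-up optimization, using the "bring your own" workflow found here (basically replacing the sample model with our own):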

https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/scikit_bring_your_own

Our project directory is set up as: Project Directory

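The problem is that I added some code that uses the dataclasses library, which is only available from Python 3.7 onward, while our project seems to keep using 3.6. This causes the setup to fail when running CloudFormation. The error in our CloudWatch logs is: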

2021-03-31T11:04:11.077-05:00
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/arbiter.py", line 589, in spawn_worker
    worker.init_process()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base.py", line 134, in init_process
    self.load_wsgi()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base.py", line 146, in load_wsgi
    self.wsgi = self.app.wsgi()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/base.py", line 67, in wsgi
    self.callable = self.load()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/wsgiapp.py", line 58, in load
    return self.load_wsgiapp()
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
    return util.import_app(self.app_uri)
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/util.py", line 359, in import_app
    mod = importlib.import_module(module)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/program/wsgi.py", line 1, in <module>
    import predictor as myapp
  File "/opt/program/predictor.py", line 9, in <module>
    from model_contents.model import MultiArmBandit, BanditParameters
  File "/opt/program/model_contents/model.py", line 7, in <module>
    from dataclasses import dataclass, field, asdict

2021-03-31T11:04:11.077-05:00
ModuleNotFoundError: No module named 'dataclasses'

Our updated Dockerfile is:

# This is a Python 3 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.

FROM ubuntu:18.04

# Retrieves information about what packages can be installed
RUN apt-get -y update && apt-get install -y --no-install-recommends \
         wget \
         python3-pip \
         python3.8 \
         python3-setuptools \
         nginx \
         ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Set python 3.8 as default
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.8 1
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1

# Here we get all python packages.
RUN pip --no-cache-dir install numpy boto3 flask gunicorn

# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.

ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"
ENV PYTHONPATH /model_contents

# Set up the program in the image
COPY bandit/ /opt/program/
WORKDIR /opt/program/

RUN chmod +x /opt/program/serve && chmod +x /opt/program/train
LABEL git_tag=$GIT_TAG

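I'm not sure whether the nginx.conf file defaults to Py 3.6, so I want to make sure that upgrading to Py 3.7 or 3.8 is not a big deal and does not require too many changes.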
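
The traceback paths above point at gunicorn living under python3.6's dist-packages, so one thing worth checking is which interpreter the installed gunicorn entry point is bound to. The following is only a minimal sketch of such a check, assuming gunicorn was installed with pip and is on PATH inside the container; `script_interpreter` is a hypothetical helper name, not part of gunicorn or SageMaker.

```python
import shutil
from typing import Optional


def script_interpreter(script_path: str) -> Optional[str]:
    """Return the interpreter named on a script's shebang line, or None."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


if __name__ == "__main__":
    # Assumption: the gunicorn console script is discoverable on PATH.
    gunicorn_path = shutil.which("gunicorn")
    if gunicorn_path is None:
        print("gunicorn not found on PATH")
    else:
        print(gunicorn_path, "->", script_interpreter(gunicorn_path))
```

If the printed interpreter resolves to Python 3.6, the packages were probably installed for 3.6 rather than for the newer interpreter, whatever nginx.conf contains.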

After installing Python 3.8 with apt-get, you can update the Dockerfile with the following RUN commands:

RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.8 1
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1

The first RUN command links /usr/bin/python to /usr/bin/python3.8, and the second links /usr/bin/python3 to /usr/bin/python3.8.
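
To confirm the change took effect, a small check can be run with the container's `python3`. This is a rough sketch rather than anything from the SageMaker sample: the `check_runtime` function and the `MIN_VERSION` constant are hypothetical names, and the only fact it relies on is that `dataclasses` joined the standard library in Python 3.7.

```python
import sys

MIN_VERSION = (3, 7)  # dataclasses joined the standard library in Python 3.7


def check_runtime(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return (ok, message) describing whether the interpreter is new enough."""
    current = tuple(version_info[:2])
    required = tuple(minimum)
    if current < required:
        return False, "Python %d.%d found, need %d.%d or newer" % (current + required)
    return True, "Python %d.%d is new enough" % current


if __name__ == "__main__":
    ok, message = check_runtime()
    print(sys.executable, "-", message)
    if ok:
        # This import is the one that fails in the traceback under Python 3.6.
        import dataclasses
        print("dataclasses imported from", dataclasses.__file__)
```

Saved under a name of your choosing inside the image, it could be run by overriding the container command, for example `docker run --rm <your-image> python3 <script>`, where both placeholders are yours to fill in.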