How to run and deploy AWS's XGBoost MNIST sample notebook on SageMaker?

Question

I am trying to use the SageMaker Operators for Kubernetes with AWS's XGBoost MNIST example.

Before enabling the Kubernetes operators, I deployed the XGBoost MNIST example through the SageMaker web UI itself and tried to call the endpoint with awscli:

$ aws sagemaker-runtime invoke-endpoint \
    --region eu-west-1 \
    --endpoint-name DEMO-XGBoostEndpoint-2020-11-20-06-26-30 \
    --body $(seq 784 | xargs echo | sed 's/ /,/g') \
    >(cat) \
    --content-type text/csv > /dev/null

However, I get the following decoding error:

An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (415) from model with message "Loading csv data failed with Exception, please ensure data is in csv format:
 <class 'UnicodeDecodeError'>
 'utf-8' codec can't decode byte 0xd7 in position 0: invalid continuation byte". See https://xxxx.console.aws.amazon.com/cloudwatch/home?region=xxxxx#logEventViewer:group=/aws/sagemaker/Endpoints/DEMO-XGBoostEndpoint-2020-11-20-06-26-30 in account XXX for more information.

In the logs I can see:

Traceback (most recent call last):
  File "/miniconda3/lib/python3.6/site-packages/sagemaker_xgboost_container/algorithm_mode/serve_utils.py", line 102, in parse_content_data
    decoded_payload = payload.strip().decode("utf-8")

Looking at the source code of sagemaker_xgboost_container, I can see that it expects the payload to be UTF-8:

        decoded_payload = payload.strip().decode("utf-8")

My locale looks fine, and I am really not sure what else could be wrong:

$ locale
LANG=C.UTF-8
LANGUAGE=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_PAPER="C.UTF-8"
LC_NAME="C.UTF-8"
LC_ADDRESS="C.UTF-8"
LC_TELEPHONE="C.UTF-8"
LC_MEASUREMENT="C.UTF-8"
LC_IDENTIFICATION="C.UTF-8"
LC_ALL=

Answer 1

I contacted AWS Support, and apparently this is a bug in awscli. Here is an edited excerpt from their long and detailed answer.

This is an encoding issue with AWSCLI v2. For now, you can proceed with AWSCLI v1.18 as a temporary solution.
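The 0xd7 byte in the error message is consistent with the CSV body being base64-decoded before it reaches the endpoint. The sketch below only illustrates that idea and is not the CLI's actual code: it assumes the body is run through a lenient base64 decoder that skips non-alphabet characters such as commas, which is what Python's base64.b64decode does by default. The helper names are hypothetical.

```python
import base64


def csv_body(n=784):
    """Build the same body as `seq 784 | xargs echo | sed 's/ /,/g'`."""
    return ",".join(str(i) for i in range(1, n + 1))


def decode_as_base64(body):
    """Treat the raw CSV text as base64 (assumed AWS CLI v2 default behavior).

    b64decode silently drops characters outside the base64 alphabet,
    so the commas vanish and the digits are read as base64 symbols.
    """
    return base64.b64decode(body)


def container_parse(payload):
    """Mirror the container line quoted in the question."""
    return payload.strip().decode("utf-8")


mangled = decode_as_base64(csv_body())
print(hex(mangled[0]))  # 0xd7, the byte reported in the error

try:
    container_parse(mangled)
except UnicodeDecodeError as exc:
    # Same failure mode as the 415 response from the endpoint.
    print(exc)
```

Under that assumption the digits "1" and "2" alone are enough to produce 0xd7 as the first byte, so the locale settings play no part in the failure.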

I also verified that it works with aws-cli/1.18.185:

$ aws --version
aws-cli/1.18.185 Python/3.8.3 Linux/4.19.104-microsoft-standard botocore/1.19.25

$ aws sagemaker-runtime invoke-endpoint \
>     --region eu-west-1 \
>     --endpoint-name DEMO-XGBoostEndpoint-2020-11-20-06-26-30 \
>     --body $(seq 784 | xargs echo | sed 's/ /,/g') \
>     >(cat) \
>     --content-type text/csv > /dev/null
8.0%

In AWS CLI v2.1.21, Amazon added the --cli-binary-format raw-in-base64-out option, which should make it work the same way as AWS CLI v1.18:

$ aws sagemaker-runtime invoke-endpoint \
>     --region <aws-region> \
>     --endpoint-name <your-endpoint-name> \
>     --cli-binary-format raw-in-base64-out \
>     --body $(seq 784 | xargs echo | sed 's/ /,/g') \
>     >(cat) \
>     --content-type text/csv > /dev/null
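If you would rather leave the CLI's binary format setting untouched, another possibility is to base64-encode the body yourself so that a CLI expecting base64 input decodes it back to the original CSV. This is a minimal sketch under that assumption, not something AWS Support suggested; the helper name encoded_body is hypothetical.

```python
import base64


def encoded_body(n=784):
    """Return the CSV row 1..n as a base64 string.

    Intended for --body when the CLI expects base64 input
    (assumed to be the AWS CLI v2 default).
    """
    raw = ",".join(str(i) for i in range(1, n + 1))
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


body = encoded_body()

# Round trip: decoding must give back valid UTF-8 CSV text.
restored = base64.b64decode(body).decode("utf-8")
print(restored[:10])
```

The string returned by encoded_body() would then take the place of the $(seq 784 | ...) expression passed to --body.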
