Docker-compose 无法在 Sagemaker 的笔记本实例上启动

Docker-compose wouldn't start on Sagemaker's Notebook instance

Docker-compose 似乎已停止在 Sagemaker Notebook 实例上工作。 运行 docker-compose up 时遇到如下错误:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/docker-compose", line 8, in <module>
    sys.exit(main())
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/compose/cli/main.py", line 81, in main
    command_func()
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/compose/cli/main.py", line 200, in perform_command
    project = project_from_options('.', options)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/compose/cli/command.py", line 70, in project_from_options
    enabled_profiles=get_profiles_from_options(options, environment)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/compose/cli/command.py", line 153, in get_project
    verbose=verbose, version=api_version, context=context, environment=environment
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/compose/cli/docker_client.py", line 43, in get_client
    environment=environment, tls_version=get_tls_version(environment)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/compose/cli/docker_client.py", line 170, in docker_client
    client = APIClient(use_ssh_client=not use_paramiko_ssh, **kwargs)
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/docker/api/client.py", line 197, in __init__
    self._version = self._retrieve_server_version()
  File "/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/docker/api/client.py", line 222, in _retrieve_server_version
    'Error while fetching server API version: {0}'.format(e)
docker.errors.DockerException: Error while fetching server API version: Timeout value connect was Timeout(connect=60, read=60, total=None), but it must be an int, float or None

我可以像往常一样启动 Docker 个容器。

sh-4.2$ docker version
Client:
 Version:           20.10.7
 API version:       1.41
 Go version:        go1.15.14
 Git commit:        f0df350
 Built:             Tue Sep 28 19:55:40 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.7
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.15.14
  Git commit:       b0f5bc3
  Built:            Tue Sep 28 19:57:35 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.6
  GitCommit:        d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc:
  Version:          1.0.0
  GitCommit:        %runc_commit
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

但是docker-compose不行...

sh-4.2$ docker-compose version
docker-compose version 1.29.2, build unknown
docker-py version: 5.0.0
CPython version: 3.6.13
OpenSSL version: OpenSSL 1.1.1l  24 Aug 2021

对于那些(可能)遇到同样问题的人,这里是解决方法:

1).安装最新版本的 docker-compose:

sh-4.2$ sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sh-4.2$ sudo chmod +x /usr/local/bin/docker-compose

2).相应地更改您的 PATH(因为 docker-compose 是使用 conda 安装的并且首先被拾取)或使用 /usr/local/bin/docker-compose 之后:

sh-4.2$ PATH=/usr/local/bin:$PATH
sh-4.2$ docker-compose version
docker-compose version 1.29.2, build 5becea4c
docker-py version: 5.0.0
CPython version: 3.7.10
OpenSSL version: OpenSSL 1.1.0l  10 Sep 2019

问题可能与此有关:

On August 9, 2021 the Jupyter Notebook and Jupyter Lab open source software projects announced 2 security concerns that could impact Amazon Sagemaker Notebook Instance customers.

Sagemaker has deployed updates to address these concerns, and we recommend customers with existing notebook sessions to stop and restart their notebook instance(s) to benefit from these updates. Notebook instances launched after August 10, 2021, when updates were deployed, are not impacted by this issue and do not need to be restarted.