How can I connect to ray in docker on Mac M1?
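The post is long mainly because of all the error messages. The gist is:
- I start a docker container with ray (the latest tag currently ships ray version 1.9.2)
- Using docker exec, I start a python process in this container
- From python I try to connect to ray
- This works on Linux, but the connection attempt fails on an M1 Mac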
➜ docker run -it rayproject/ray:latest
$ ray start --head --block --num-gpus=1
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
(base) ray@8346ae81903e:~$ ray start --head --dashboard-host 0.0.0.0 --block --include-dashboard true
Local node IP: 172.17.0.2
2022-01-27 01:22:01,109 INFO services.py:1340 -- View the Ray dashboard at http://172.17.0.2:8265
2022-01-27 01:22:01,119 WARNING services.py:1826 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 67108864 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=1.78gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
[mutex.cc : 926] RAW: pthread_getschedparam failed: 1
--------------------
Ray runtime started.
--------------------
Next steps
To connect to this Ray runtime from another node, run
ray start --address='172.17.0.2:6379' --redis-password='XXXXX'
Alternatively, use the following Python code:
import ray
ray.init(address='auto', _redis_password='XXXXX')
To connect to this Ray runtime from outside of the cluster, for example to
connect to a remote cluster from your laptop directly, use the following
Python code:
import ray
ray.init(address='ray://<head_node_ip_address>:10001')
...
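Both warnings in the startup output above point at `docker run` options: the platform mismatch corresponds to Docker's `--platform` flag, and the `/dev/shm` warning itself suggests `--shm-size`. As a minimal sketch, the helper below assembles such a command line. The function name and the choice of published ports are assumptions for illustration, not something from the post, and the code only builds and prints the command. Whether these flags resolve the connection failure on M1 is not established here.

```python
import shlex


def build_docker_run_command(image, shm_size=None, platform=None, ports=()):
    """Assemble a `docker run` argument list (hypothetical helper)."""
    cmd = ["docker", "run", "-it"]
    if platform:
        # Request the image platform explicitly instead of relying on the default.
        cmd.append(f"--platform={platform}")
    if shm_size:
        # Enlarge /dev/shm, as the Ray object-store warning suggests.
        cmd.append(f"--shm-size={shm_size}")
    for port in ports:
        # Publish the container port on the same host port.
        cmd += ["-p", f"{port}:{port}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    command = build_docker_run_command(
        "rayproject/ray:latest",
        shm_size="1.78gb",
        platform="linux/amd64",
        ports=(8265, 10001),  # dashboard and Ray client ports from the log above
    )
    print(shlex.join(command))
```

Returning a list rather than a string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.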
I then attach to the container with docker exec -it ... bash, start a python REPL, and try the command suggested by the ray output above.
import ray
ray.init(address='auto', _redis_password='XXXXX')
This results in:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 834, in init
redis_address, _, _ = services.validate_redis_address(address)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/services.py", line 375, in validate_redis_address
address = find_redis_address_or_die()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/services.py", line 287, in find_redis_address_or_die
"Could not find any running Ray instance. "
ConnectionError: Could not find any running Ray instance. Please specify the one to connect to by setting `address`.
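One way to narrow down a "Could not find any running Ray instance" error is to check whether anything is listening on the head node's ports at all. The sketch below uses only the standard library. The helper name `port_is_open` is made up for this example, and the address and ports are the ones printed by `ray start` above. It says nothing about why a port is closed, only whether a TCP connection can be opened.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 6379 and 10001 are the ports from the `ray start` output.
    for port in (6379, 10001):
        state = "open" if port_is_open("172.17.0.2", port) else "closed"
        print(f"172.17.0.2:{port} is {state}")
```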
Trying to connect via a specific address did not work either.
ray.init(address='ray://localhost:10001')
[mutex.cc : 926] RAW: pthread_getschedparam failed: 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 775, in init
return builder.connect()
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/client_builder.py", line 155, in connect
ray_init_kwargs=self._remote_init_kwargs)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client_connect.py", line 42, in connect
ray_init_kwargs=ray_init_kwargs)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/__init__.py", line 228, in connect
conn = self.get_context().connect(*args, **kw_args)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/__init__.py", line 88, in connect
self.client_worker._server_init(job_config, ray_init_kwargs)
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/worker.py", line 698, in _server_init
f"Initialization failure from server:\n{response.msg}")
ConnectionAbortedError: Initialization failure from server:
Traceback (most recent call last):
File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 629, in Datapath
"Starting Ray client server failed. See "
RuntimeError: Starting Ray client server failed. See ray_client_server_23000.err for detailed logs.
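The final error points at `ray_client_server_23000.err` for details. A small sketch for printing the end of that file from inside the container follows. It assumes Ray's default log directory `/tmp/ray/session_latest/logs`, which may differ if a custom temp dir was configured, and `tail_lines` is a hypothetical helper name.

```python
from collections import deque
from pathlib import Path

# Assumed default Ray log directory; adjust if Ray was started with a custom temp dir.
LOG_DIR = Path("/tmp/ray/session_latest/logs")


def tail_lines(path, n=20):
    """Return the last n lines of a text file, or [] if it does not exist."""
    path = Path(path)
    if not path.is_file():
        return []
    with path.open(errors="replace") as handle:
        # deque with maxlen keeps only the most recent n lines.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


if __name__ == "__main__":
    for line in tail_lines(LOG_DIR / "ray_client_server_23000.err"):
        print(line)
```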
I ran into the same problem and solved it by setting dashboard_host to 0.0.0.0:
ray.init(dashboard_host="0.0.0.0",dashboard_port=6379)