How can I connect to ray in docker on Mac M1?

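The post is long mostly because of all the error messages. The gist is:

  1. I started a docker container with ray (the latest tag currently ships ray version 1.9.2)
  2. Using docker exec, I started a python process in this container
  3. From python, I tried to connect to ray
  4. This works on Linux, but the connection attempt fails on an M1 Mac
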
➜ docker run -it rayproject/ray:latest
$ ray start --head --block --num-gpus=1
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
(base) ray@8346ae81903e:~$ ray start --head --dashboard-host 0.0.0.0 --block --include-dashboard true
Local node IP: 172.17.0.2
2022-01-27 01:22:01,109 INFO services.py:1340 -- View the Ray dashboard at http://172.17.0.2:8265
2022-01-27 01:22:01,119 WARNING services.py:1826 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 67108864 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=1.78gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
[mutex.cc : 926] RAW: pthread_getschedparam failed: 1

--------------------
Ray runtime started.
--------------------

Next steps
  To connect to this Ray runtime from another node, run
    ray start --address='172.17.0.2:6379' --redis-password='XXXXX'
  
  Alternatively, use the following Python code:
    import ray
    ray.init(address='auto', _redis_password='XXXXX')
  
  To connect to this Ray runtime from outside of the cluster, for example to
  connect to a remote cluster from your laptop directly, use the following
  Python code:
    import ray
    ray.init(address='ray://<head_node_ip_address>:10001')

...

I then connected to the container with docker exec -it ... bash, started a python REPL, and tried the command suggested by the ray output above.

import ray
ray.init(address='auto', _redis_password='XXXXX')

This results in:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 834, in init
    redis_address, _, _ = services.validate_redis_address(address)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/services.py", line 375, in validate_redis_address
    address = find_redis_address_or_die()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/services.py", line 287, in find_redis_address_or_die
    "Could not find any running Ray instance. "
ConnectionError: Could not find any running Ray instance. Please specify the one to connect to by setting address.

Trying to connect via a specific address did not work either.

ray.init(address='ray://localhost:10001')

[mutex.cc : 926] RAW: pthread_getschedparam failed: 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/worker.py", line 775, in init
    return builder.connect()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/client_builder.py", line 155, in connect
    ray_init_kwargs=self._remote_init_kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client_connect.py", line 42, in connect
    ray_init_kwargs=ray_init_kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/__init__.py", line 228, in connect
    conn = self.get_context().connect(*args, **kw_args)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/__init__.py", line 88, in connect
    self.client_worker._server_init(job_config, ray_init_kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/worker.py", line 698, in _server_init
    f"Initialization failure from server:\n{response.msg}")
ConnectionAbortedError: Initialization failure from server:
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/util/client/server/proxier.py", line 629, in Datapath
    "Starting Ray client server failed. See "
RuntimeError: Starting Ray client server failed. See ray_client_server_23000.err for detailed logs.

I ran into the same problem and solved it by setting dashboard_host to 0.0.0.0.

ray.init(dashboard_host="0.0.0.0", dashboard_port=6379)
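
If it is unclear whether the change took effect, one way to narrow things down is to check which of the ports named in the startup output actually accept TCP connections from inside the container. The sketch below uses only the Python standard library; port_is_open and check_ray_ports are hypothetical helper names, not part of Ray, and the default ports (6379, 8265, 10001) are simply the ones that appear in the question's output.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ray_ports(host: str = "127.0.0.1", ports=(6379, 8265, 10001)) -> dict:
    """Map each port to whether it accepts connections on the given host.

    The defaults are assumptions taken from the question's output:
    6379 (address printed by `ray start`), 8265 (dashboard),
    10001 (ray:// client address).
    """
    return {port: port_is_open(host, port) for port in ports}
```

Running print(check_ray_ports()) in the same python REPL started via docker exec shows which ports are listening. This only tells you that something accepts connections on a port, not that Ray itself is healthy, so the .err log mentioned in the traceback is still worth reading.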