cephadm: Not able to add nodes to ceph cluster (Error EINVAL: Failed to connect to host)
I set up a Ceph cluster on CentOS 8.1 by following these steps from https://docs.ceph.com/en/latest/cephadm/install/:
curl --silent --remote-name --location https://github.com/ceph/ceph/raw/octopus/src/cephadm/cephadm
chmod +x cephadm
./cephadm add-repo --release octopus
./cephadm install
After running the commands above, I found that Ceph needs Docker or Podman to run. So I installed Docker Community Edition from https://docs.docker.com/engine/install/centos/ and continued with the steps below.
./cephadm install
mkdir -p /etc/ceph
cephadm bootstrap --mon-ip <ip_of_the_current_machine>   # i.e. the IP of host1
cephadm install ceph-common
ssh-copy-id -f -i /etc/ceph/ceph.pub root@host2
ceph orch host add host2
The last command fails with this error:
[root@host1 home]# ceph orch host add host2
INFO:cephadm:Inferring fsid 12345678-2345-6789-1011-000129110013
INFO:cephadm:Inferring config /var/lib/ceph/12345678-2345-6789-1011-000129110013/mon.host1/config
INFO:cephadm:Using recent ceph image ceph/ceph:v15
Error EINVAL: Failed to connect to host2 (host2).
Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to run:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > key
> ssh -F ssh_config -i key root@host2
I can log in to host2 using the steps above.
Can anyone tell me whether I am doing something wrong, and how I can fix this?
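After a few days of debugging, I found that python3 was missing on the node I was trying to add. All it took to find this was checking the most recent log entries with: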
ceph log last cephadm
This produced the following log messages:
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/module.py", line 1036, in _remote_connection
raise execnet.gateway_bootstrap.HostNotFound(msg)
execnet.gateway_bootstrap.HostNotFound: Can't communicate with remote host `host2`, possibly because python3 is not installed there: cannot send (already closed?)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 295, in _finalize
next_result = self._on_complete(self._value)
File "/usr/share/ceph/mgr/cephadm/module.py", line 103, in <lambda>
return CephadmCompletion(on_complete=lambda _: f(*args, **kwargs))
File "/usr/share/ceph/mgr/cephadm/module.py", line 1201, in add_host
return self._add_host(spec)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1187, in _add_host
error_ok=True, no_fsid=True)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1104, in _run_cephadm
with self._remote_connection(host, addr) as tpl:
File "/lib64/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1055, in _remote_connection
raise OrchestratorError(msg) from e
orchestrator._interface.OrchestratorError: Failed to connect to host2 (host2).
Check that the host is reachable and accepts connections using the cephadm SSH key
Next, I added my node by running:
ceph orch host add host2 ip_address
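A quick way to confirm this diagnosis before retrying is to ask the remote host directly whether python3 is on its PATH, reusing the `ssh_config` and `key` files written by the commands in the error message. This is only a sketch: `check_remote_python3` and the `SSH_CMD` override are names made up here for illustration, not part of cephadm.

```shell
# Hypothetical helper: report whether python3 is on the PATH of a remote host.
# SSH_CMD is an assumed override so the function can be tried without a real host;
# by default it uses the ssh_config and key files from the cephadm error message.
check_remote_python3() {
    host="$1"
    ${SSH_CMD:-ssh -F ssh_config -i key} "root@${host}" 'command -v python3' >/dev/null 2>&1
    rc=$?
    case "$rc" in
        0)   echo "python3 found on ${host}" ;;
        # ssh itself exits with 255 when the connection fails
        255) echo "ssh to ${host} failed"; return 2 ;;
        *)   echo "python3 not found on ${host}"; return 1 ;;
    esac
}

# Example (not run here): check_remote_python3 host2
```

If it reports python3 as missing, installing it on that node (for example `dnf install -y python3` on CentOS 8) should let `ceph orch host add` get past the execnet bootstrap.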
I ran into the same problem, but the error message I got most often was:
2021-01-13T15:21:13.071913+0000 mgr.ha1.qzzjzw (mgr.18492) 167366 : cephadm [ERR] _Promise failed
Traceback (most recent call last):
File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
s = io.read(1)
File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0
The workaround helped me too:
ceph orch host add host2 ip_address
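The two failure signatures quoted in this thread can be told apart with a small filter over the output of `ceph log last cephadm`. The sketch below is an illustration only: `cephadm_log_hint` is a made-up name, and the hints it prints simply restate what worked for the posters here rather than any official diagnosis.

```shell
# Hypothetical helper: read `ceph log last cephadm` output on stdin and
# print a hint for the two failure signatures quoted in this thread.
cephadm_log_hint() {
    log="$(cat)"
    case "$log" in
        *"python3 is not installed"*)
            echo "check that python3 is installed on the remote host" ;;
        *"EOFError: expected 1 bytes, got 0"*)
            echo "try passing the host IP to ceph orch host add" ;;
        *)
            echo "no known signature found" ;;
    esac
}

# Example (not run here): ceph log last cephadm | cephadm_log_hint
```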
I hit the same problem as Oleg when using cephadm on Debian 10.
The workaround is to add the IP address.
sudo ./cephadm shell
ceph orch host add host2 ip_address
Added host 'host2'
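For anyone scripting this workaround, a minimal sketch of building the command with an explicit address follows. `orch_add_cmd` is a hypothetical helper, and the fallback lookup assumes `getent` and `awk` are available on the admin host; it only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: print the `ceph orch host add` command with an explicit address.
# If no address is passed, try to resolve one with getent (assumed to be available).
orch_add_cmd() {
    host="$1"
    addr="${2:-$(getent hosts "$host" | awk '{print $1; exit}')}"
    if [ -z "$addr" ]; then
        echo "could not resolve ${host}; pass the IP explicitly" >&2
        return 1
    fi
    echo "ceph orch host add ${host} ${addr}"
}

# Example (not run here): orch_add_cmd host2 192.0.2.12
```

Printing rather than executing keeps the helper safe to try outside the cluster; piping its output to `sh` inside `cephadm shell` would run it.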