Rancher 2 cluster setup not working, multiple errors found

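I'm trying to create a cluster by adding 2 custom VMs.

I created the cluster by setting a name and defining the roles (etcd, controlplane, and worker) for each node, then ran the generated command on each node.

After waiting a few minutes, I see the following error: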

[[network] Host [X.Y.Z.14] is not able to connect to the following ports: [X.Y.Z.10:2379, X.Y.Z.10:2380]. Please check network policies and firewall rules]

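These IP addresses belong to the nodes being added to the cluster. The server IP is X.Y.Z.9, and it has none of these roles.

All 3 VMs (the server and the two nodes) are fresh installs of CentOS 7. I did this setup with SELinux enabled, but I also tried disabling it on all 3 VMs as a test, just to check whether this was an issue between SELinux and Rancher.

Am I missing a step? Where should I investigate? I looked at the logs of the rancher server container; here is part of them: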

2019/12/02 12:10:26 [INFO] kontainerdriver rancherkubernetesengine stopped
2019/12/02 12:10:26 [ERROR] ClusterController c-mb7xc [cluster-provisioner-controller] failed with : [[network] Host [X.Y.Z.14] is not able to connect to the following ports: [X.Y.Z.10:2379, X.Y.Z.10:2380]. Please check network policies and firewall rules]
2019-12-02 12:13:26.885195 I | mvcc: store.index: compact 115706
2019-12-02 12:13:26.886955 I | mvcc: finished scheduled compaction at 115706 (took 1.379118ms)
2019/12/02 12:14:26 [INFO] Provisioning cluster [c-mb7xc]
2019/12/02 12:14:26 [INFO] Creating cluster [c-mb7xc]
2019/12/02 12:14:31 [INFO] kontainerdriver rancherkubernetesengine listening on address 127.0.0.1:42728
2019/12/02 12:14:31 [ERROR] Cluster c-mb7xc previously failed to create
2019/12/02 12:14:31 [INFO] cluster [c-mb7xc] provisioning: Initiating Kubernetes cluster
2019/12/02 12:14:31 [INFO] cluster [c-mb7xc] provisioning: [certificates] Generating admin certificates and kubeconfig
2019/12/02 12:14:31 [INFO] cluster [c-mb7xc] provisioning: Successfully Deployed state file at [management-state/rke/rke-770316984/cluster.rkestate]
2019/12/02 12:14:31 [INFO] cluster [c-mb7xc] provisioning: Building Kubernetes cluster
2019/12/02 12:14:31 [INFO] cluster [c-mb7xc] provisioning: [dialer] Setup tunnel for host [X.Y.Z.14]
2019/12/02 12:14:31 [INFO] [network] Starting stopped container [rke-etcd-port-listener] on host [X.Y.Z.10]
2019/12/02 12:14:31 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.10], try #1
2019/12/02 12:14:31 [INFO] [network] Starting stopped container [rke-etcd-port-listener] on host [X.Y.Z.14]
2019/12/02 12:14:31 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.14], try #1
2019/12/02 12:14:31 [INFO] cluster [c-mb7xc] provisioning: [dialer] Setup tunnel for host [X.Y.Z.10]
2019/12/02 12:14:31 [INFO] cluster [c-mb7xc] provisioning: [network] Deploying port listener containers
2019/12/02 12:14:31 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.10]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (5a38613b1495ef436cd7842ade853e6f2a11948f5f00f0d2a0ff0d57e83aa115): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:14:31 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.10], try #2
2019/12/02 12:14:31 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.14]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (445b5b6cbaf4a2078f15d44741b91245d4f63288bb1ad3894787f9060ada4e33): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:14:31 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.14], try #2
2019/12/02 12:14:32 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.10]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (d5a2cfc270aab68cee979b2fe1705a2ff574ba167f0ad011d2626e4edc94ac01): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:14:32 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.10], try #3
2019/12/02 12:14:32 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.14]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (1e7463f5c50b7d967824a380695cbf6f73e1c8f13368c6e12712330e64d6a358): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:14:32 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.14], try #3
2019/12/02 12:14:32 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.10]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (81941519760f80c47c05f5a44c8076adfb796a6675201614931b51bbb7b63714): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:14:32 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.14]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (9fff51bada3afffc9c14a7c5ddf5f25e889b71a2d128dca8d7cda8c56fa7fed4): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:14:32 [INFO] cluster [c-mb7xc] provisioning: [network] Port listener containers deployed successfully
2019/12/02 12:14:32 [INFO] Image [rancher/rke-tools:v0.1.51] exists on host [X.Y.Z.14]
2019/12/02 12:14:32 [INFO] Image [rancher/rke-tools:v0.1.51] exists on host [X.Y.Z.10]
2019/12/02 12:14:32 [INFO] cluster [c-mb7xc] provisioning: [network] Running etcd <-> etcd port checks
2019/12/02 12:14:32 [INFO] Starting container [rke-port-checker] on host [X.Y.Z.14], try #1
2019/12/02 12:14:32 [INFO] cluster [c-mb7xc] provisioning: [network] Successfully started [rke-port-checker] container on host [X.Y.Z.14]
2019/12/02 12:14:32 [INFO] Removing container [rke-port-checker] on host [X.Y.Z.14], try #1
2019/12/02 12:14:32 [INFO] Starting container [rke-port-checker] on host [X.Y.Z.10], try #1
2019/12/02 12:14:32 [INFO] cluster [c-mb7xc] provisioning: [network] Successfully started [rke-port-checker] container on host [X.Y.Z.10]
2019/12/02 12:14:38 [INFO] Removing container [rke-port-checker] on host [X.Y.Z.10], try #1
2019/12/02 12:14:38 [ERROR] cluster [c-mb7xc] provisioning: [[network] Host [X.Y.Z.14] is not able to connect to the following ports: [X.Y.Z.10:2379, X.Y.Z.10:2380]. Please check network policies and firewall rules]
2019/12/02 12:14:38 [INFO] kontainerdriver rancherkubernetesengine stopped
2019/12/02 12:14:38 [ERROR] ClusterController c-mb7xc [cluster-provisioner-controller] failed with : [[network] Host [X.Y.Z.14] is not able to connect to the following ports: [X.Y.Z.10:2379, X.Y.Z.10:2380]. Please check network policies and firewall rules]
2019-12-02 12:18:26.889193 I | mvcc: store.index: compact 116351
2019-12-02 12:18:26.890642 I | mvcc: finished scheduled compaction at 116351 (took 1.10593ms)
2019/12/02 12:22:38 [INFO] Provisioning cluster [c-mb7xc]
2019/12/02 12:22:38 [INFO] Creating cluster [c-mb7xc]
2019/12/02 12:22:43 [INFO] kontainerdriver rancherkubernetesengine listening on address 127.0.0.1:33176
2019/12/02 12:22:43 [ERROR] Cluster c-mb7xc previously failed to create
2019/12/02 12:22:43 [INFO] cluster [c-mb7xc] provisioning: Initiating Kubernetes cluster
2019/12/02 12:22:43 [INFO] cluster [c-mb7xc] provisioning: [certificates] Generating admin certificates and kubeconfig
2019/12/02 12:22:43 [INFO] cluster [c-mb7xc] provisioning: Successfully Deployed state file at [management-state/rke/rke-153618103/cluster.rkestate]
2019/12/02 12:22:43 [INFO] cluster [c-mb7xc] provisioning: Building Kubernetes cluster
2019/12/02 12:22:43 [INFO] cluster [c-mb7xc] provisioning: [dialer] Setup tunnel for host [X.Y.Z.10]
2019/12/02 12:22:43 [INFO] cluster [c-mb7xc] provisioning: [dialer] Setup tunnel for host [X.Y.Z.14]
2019/12/02 12:22:43 [INFO] [network] Starting stopped container [rke-etcd-port-listener] on host [X.Y.Z.14]
2019/12/02 12:22:43 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.14], try #1
2019/12/02 12:22:43 [INFO] [network] Starting stopped container [rke-etcd-port-listener] on host [X.Y.Z.10]
2019/12/02 12:22:43 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.10], try #1
2019/12/02 12:22:43 [INFO] cluster [c-mb7xc] provisioning: [network] Deploying port listener containers
2019/12/02 12:22:43 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.14]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (7ef490bf1c3963f131972836836d7f01acf0a7f9f808eede2cf19e57e4b3c62c): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:22:43 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.14], try #2
2019/12/02 12:22:43 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.10]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (f42e672d345f9468871fcc130c432885dde17b70bda4f2dc23d1f7f443ecac6e): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:22:43 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.10], try #2
2019/12/02 12:22:43 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.10]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (11c409cf4e33232e2c5d39ae60981620793ca531482a8f09677e7c3e47750df6): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:22:43 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.10], try #3
2019/12/02 12:22:43 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.14]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (26df18e96a9a328a390a2d3a832cf665c7ef455e46058b0748f0e63e6c356612): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:22:43 [INFO] Starting container [rke-etcd-port-listener] on host [X.Y.Z.14], try #3
2019/12/02 12:22:44 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.10]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (142ef4546b5b9afb113ef7282970e84dce1131dce21e32caafde54d870838792): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:22:44 [WARNING] Can't start Docker container [rke-etcd-port-listener] on host [X.Y.Z.14]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (8e0eda16eb1e99088e4bd2dd3f5134bf6230fdc03dd10aac24c76e6d71826ac3): Error starting userland proxy: listen tcp 0.0.0.0:2380: bind: address already in use
2019/12/02 12:22:44 [INFO] cluster [c-mb7xc] provisioning: [network] Port listener containers deployed successfully
2019/12/02 12:22:44 [INFO] Image [rancher/rke-tools:v0.1.51] exists on host [X.Y.Z.14]
2019/12/02 12:22:44 [INFO] Image [rancher/rke-tools:v0.1.51] exists on host [X.Y.Z.10]
2019/12/02 12:22:44 [INFO] cluster [c-mb7xc] provisioning: [network] Running etcd <-> etcd port checks
2019/12/02 12:22:44 [INFO] Starting container [rke-port-checker] on host [X.Y.Z.14], try #1
2019/12/02 12:22:44 [INFO] cluster [c-mb7xc] provisioning: [network] Successfully started [rke-port-checker] container on host [X.Y.Z.14]
2019/12/02 12:22:44 [INFO] Starting container [rke-port-checker] on host [X.Y.Z.10], try #1
2019/12/02 12:22:44 [INFO] Removing container [rke-port-checker] on host [X.Y.Z.14], try #1
2019/12/02 12:22:44 [INFO] cluster [c-mb7xc] provisioning: [network] Successfully started [rke-port-checker] container on host [X.Y.Z.10]
2019/12/02 12:22:49 [INFO] Removing container [rke-port-checker] on host [X.Y.Z.10], try #1
2019/12/02 12:22:49 [ERROR] cluster [c-mb7xc] provisioning: [[network] Host [X.Y.Z.14] is not able to connect to the following ports: [X.Y.Z.10:2379, X.Y.Z.10:2380]. Please check network policies and firewall rules]
2019/12/02 12:22:49 [INFO] kontainerdriver rancherkubernetesengine stopped
2019/12/02 12:22:49 [ERROR] ClusterController c-mb7xc [cluster-provisioner-controller] failed with : [[network] Host [X.Y.Z.14] is not able to connect to the following ports: [X.Y.Z.10:2379, X.Y.Z.10:2380]. Please check network policies and firewall rules]

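All of the warnings show listen tcp 0.0.0.0:2380: bind: address already in use. I suggest you check whether a service is already running on that port. If not, check whether a container (possibly stopped by now) has this port bound to itself.

Use docker container ls -a to list all containers, including ones that are not running. On Linux, use netstat -tulpen | grep 2380 to list the services running on port 2380. On Windows, use netstat -an | findstr 2380 in a command prompt to do the same.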
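If netstat is not available on the node, the same two questions (is the port already taken locally, and can another node reach it) can be checked with a short script. This is only a minimal sketch using Python's standard socket module; the function names port_is_free and can_connect are made up for illustration, and it does not reproduce what Rancher's own rke-port-checker container does.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can bind to host:port right now.

    A False result corresponds to the "bind: address already in use"
    warning in the log (or to a permission problem on the port).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 2379 and 2380 are the etcd ports named in the error message.
    for port in (2379, 2380):
        state = "free" if port_is_free(port) else "already in use"
        print(f"local port {port}: {state}")
```

Run it on the etcd node to see whether 2379 and 2380 are occupied. From the other node, calling can_connect with the etcd node's address and each port shows whether a firewall is blocking the path; note that this only returns True while something is actually listening on the target port.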

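I was able to solve this after starting from scratch with Ubuntu and a newer version of Rancher.

I don't think the operating system was the problem here; rather, there was a known issue in the Rancher version I was using.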