CoreDNS crashes with error "Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/****: dial tcp 10.96.0.1:443: connect: no route to host"

Question

One of the CoreDNS pods is not becoming ready. Please find the status below.

kubectl get po --all-namespaces -o wide | grep -i coredns
kube-system            coredns-6955765f44-8qhkr                    1/1     Running            0          24m     10.244.0.59      k8s-master          <none>           <none>
kube-system            coredns-6955765f44-lpmjk                    0/1     Running            0          24m     10.244.1.43      k8s-worker-node-1   <none>           <none>
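
The `0/1` in the READY column is what marks the failing replica, even though its STATUS is `Running`. A minimal sketch for filtering on that column follows; the function name `not_ready_pods` is hypothetical, and it assumes the `--all-namespaces` column order shown above (namespace, name, READY, ...).

```shell
# Hypothetical helper: print NAMESPACE/NAME for every pod whose READY column
# (third field when --all-namespaces is used) has fewer ready containers
# than total containers. Header and malformed lines are skipped.
not_ready_pods() {
  awk '$3 ~ /^[0-9]+\/[0-9]+$/ {
         split($3, r, "/")
         if (r[1] != r[2]) print $1 "/" $2
       }'
}

# Usage (needs cluster access):
#   kubectl get po --all-namespaces -o wide | grep -i coredns | not_ready_pods
```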

Please find the logs of the pod below.

kubectl logs coredns-6955765f44-lpmjk -n kube-system



E0420 03:43:03.855622       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0420 03:43:03.855622       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0420 03:43:03.855622       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0420 03:43:03.855622       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0420 03:43:05.859525       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
E0420 03:43:05.859525       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host
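
Every line above fails on the same address, 10.96.0.1:443, which in a default kubeadm setup is the ClusterIP of the `kubernetes` API server Service. As a rough sketch, the distinct unreachable targets can be pulled out of a longer log like this; `unreachable_targets` is a hypothetical name, and the pattern only matches the exact "dial tcp ...: connect: no route to host" wording seen in these logs.

```shell
# Hypothetical helper: read log lines on stdin and print each distinct
# host:port that failed with "no route to host".
unreachable_targets() {
  sed -n 's/.*dial tcp \([^ ]*\): connect: no route to host.*/\1/p' | sort -u
}

# Usage (needs cluster access):
#   kubectl logs coredns-6955765f44-lpmjk -n kube-system | unreachable_targets
```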

Answer 1

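To resolve the "no route to host" issue with the CoreDNS pods, you have to flush iptables on the affected node. Note that this clears every rule in the filter and nat tables, so any custom firewall rules on that node are lost: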

systemctl stop kubelet
systemctl stop docker
iptables --flush
iptables -t nat --flush
systemctl start kubelet
systemctl start docker
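
Once the services are back up, it may be worth checking whether the API server Service IP is reachable again. The following is only a sketch: `can_connect` is a hypothetical helper, it relies on bash's `/dev/tcp` feature and the coreutils `timeout` command being available, and it tests from the node itself, so the result from inside a pod may still differ.

```shell
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened
# within SECS seconds (default 3). bash is invoked explicitly because
# /dev/tcp is a bash feature, not part of POSIX sh.
can_connect() {
  host=$1 port=$2 secs=${3:-3}
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage on the worker node, with the address taken from the error message:
#   can_connect 10.96.0.1 443 && echo "reachable" || echo "still unreachable"
```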

Also note that flannel has been removed from the list of CNIs in the kubeadm documentation:

The reason for that is that Cluster Lifecycle have been getting a number of issues related to flannel (either in kubeadm or kops tickets) and we don't have good answers for the users as the project is not actively maintained. - Add note that issues for CNI should be logged in the respective issue trackers and that Calico is the only CNI we e2e test kubeadm against.

So the recommended approach would also be to move to the Calico CNI.

Answer 2

I had no errors with K8s 1.19.7 and flannel. After I upgraded to 1.21.1, the error above started to appear, and the following fix worked for me:

firewall-cmd --permanent --zone=trusted --add-source=10.244.0.0/16
firewall-cmd --reload
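
Here 10.244.0.0/16 is the pod network range (it matches the pod IPs in the question and is flannel's default); a cluster with a different pod CIDR would need that value instead. To confirm the source was actually added after the reload, a small check like the one below could be used. `has_source` is a hypothetical helper that only inspects text piped into it.

```shell
# Hypothetical helper: read the output of `firewall-cmd --list-sources` on
# stdin (entries separated by whitespace) and succeed only if the given CIDR
# appears as a complete entry.
has_source() {
  tr -s ' \t' '\n\n' | grep -Fxq -- "$1"
}

# Usage (on the node, as root):
#   firewall-cmd --zone=trusted --list-sources | has_source 10.244.0.0/16 \
#     && echo "pod CIDR is trusted"
```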

kubernetes

coredns

kubernetes-pod