CoreDNS 在获取端点、服务、命名空间时遇到问题

CoreDNS has problems getting Endpoints, Services, Namespaces

我对 master 的 CoreDNS 有以下问题(另请参阅 ready is 0/1 on master):

E0321 22:54:45.590231       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.528164       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.528164       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.528164       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.528164       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.531540       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.531540       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.531540       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.531540       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.591304       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.591304       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.591304       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused
E0321 22:54:46.591304       1 reflector.go:126] pkg/mod/k8s.io/client-go@v11.0.0+incompatible/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: connection refused

其他一切似乎 运行 正常,我也可以从集群 nodes/pods 访问互联网

kube-system           coredns-776474d56-46fnz                        1/1     Running   0          2d23h   10.32.0.3       raspberrypi4-node     <none>           <none>
kube-system           coredns-776474d56-7nlw4                        0/1     Running   0          32h     10.36.0.1       raspberrypi4-master   <none>           <none>
kube-system           etcd-raspberrypi4-master                       1/1     Running   6          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-apiserver-raspberrypi4-master             1/1     Running   4          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-controller-manager-raspberrypi4-master    1/1     Running   9          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-proxy-6vgm9                               1/1     Running   0          3d13h   192.168.0.157   raspberrypi3-node     <none>           <none>
kube-system           kube-proxy-vqqv7                               1/1     Running   5          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           kube-proxy-wj784                               1/1     Running   0          3d21h   192.168.0.90    raspberrypi4-node     <none>           <none>
kube-system           kube-scheduler-raspberrypi4-master             1/1     Running   9          3d22h   192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           weave-net-6db56                                2/2     Running   0          3d9h    192.168.0.90    raspberrypi4-node     <none>           <none>
kube-system           weave-net-7t7t6                                2/2     Running   0          3d9h    192.168.0.192   raspberrypi4-master   <none>           <none>
kube-system           weave-net-mg79s                                2/2     Running   0          3d9h    192.168.0.157   raspberrypi3-node     <none>           <none>

我已经检查了文档,一些端口没有打开,但这是对端口 443 的访问,这是一种系统特权端口,所以我想知道在这种情况下我是否需要提供对 kubernetes 的访问权限端口(并可能将其转发到 6443,在文档中是 Kubernetes API 服务器)。我还将从集群外部访问该端口,并希望 kubernetes 服务能够处理它,并且希望能通过一个简单的命令将 80 和 443 端口转发到该端口。

我刚刚注意到服务确实在监听正确的 IP/port,所以不知道为什么它拒绝连接。

$ kubectl get svc -A
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  3d22h
kube-system   kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   3d22h

问题出在 iptables 上。

  1. 确保在每个节点的 linux 内核上启用 ip 转发。 执行命令: $ sysctl net.ipv4.conf.all.forwarding = 1

  2. 如果你的docker版本>=1.13,默认的FORWARD链策略被丢弃,你必须设置FORWARD 链接到 ACCEPT. 的默认策略 执行命令: $ sudo iptables -P FORWARD ACCEPT.

  3. 最后使用标志 cluster-cidr:
    通过 kube-proxy 配置 --cluster-cidr=.

    --cluster-cidr flag 表示:

CIDR Range for Pods in cluster. Requires --allocate-node-cidrs to be true.

如果不提供,则不会执行集群外桥接。
类似问题:kubernetes-coredns-issue.

如果有帮助,请告诉我。

接受的答案没有解决我的问题。如果有人遇到类似问题,重启 coredns 解决了我的问题。

kubectl rollout restart deployment coredns --namespace kube-system