Kubernetes networking: pod cannot reach nodes

I have a Kubernetes cluster with 3 masters and 7 workers. I use Calico as the CNI. When I deploy Calico, calico-kube-controllers-xxx fails because it cannot reach 10.96.0.1:443.

2020-06-23 13:05:28.737 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0623 13:05:28.740128       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2020-06-23 13:05:28.742 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2020-06-23 13:05:38.742 [ERROR][1] client.go 261: Error getting cluster information config ClusterInformation="default" error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-06-23 13:05:38.742 [FATAL][1] main.go 114: Failed to initialize Calico datastore error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded

This is the state of the kube-system namespace:

kubectl get po -n kube-system
NAME                                       READY   STATUS             RESTARTS   AGE
calico-kube-controllers-77d6cbc65f-6bmjg   0/1     CrashLoopBackOff   56         4h33m
calico-node-94pkr                          1/1     Running            0          36m
calico-node-d8vc4                          1/1     Running            0          36m
calico-node-fgpd4                          1/1     Running            0          37m
calico-node-jqgkp                          1/1     Running            0          37m
calico-node-m9lds                          1/1     Running            0          37m
calico-node-n5qmb                          1/1     Running            0          37m
calico-node-t46jb                          1/1     Running            0          36m
calico-node-w6xch                          1/1     Running            0          38m
calico-node-xpz8k                          1/1     Running            0          37m
calico-node-zbw4x                          1/1     Running            0          36m
coredns-5644d7b6d9-ms7gv                   0/1     Running            0          4h33m
coredns-5644d7b6d9-thwlz                   0/1     Running            0          4h33m
kube-apiserver-k8s01                       1/1     Running            7          34d
kube-apiserver-k8s02                       1/1     Running            9          34d
kube-apiserver-k8s03                       1/1     Running            7          34d
kube-controller-manager-k8s01              1/1     Running            7          34d
kube-controller-manager-k8s02              1/1     Running            9          34d
kube-controller-manager-k8s03              1/1     Running            8          34d
kube-proxy-9dppr                           1/1     Running            3          4d
kube-proxy-9hhm9                           1/1     Running            3          4d
kube-proxy-9svfk                           1/1     Running            1          4d
kube-proxy-jctxm                           1/1     Running            3          4d
kube-proxy-lsg7m                           1/1     Running            3          4d
kube-proxy-m257r                           1/1     Running            1          4d
kube-proxy-qtbbz                           1/1     Running            2          4d
kube-proxy-v958j                           1/1     Running            2          4d
kube-proxy-x97qx                           1/1     Running            2          4d
kube-proxy-xjkjl                           1/1     Running            3          4d
kube-scheduler-k8s01                       1/1     Running            7          34d
kube-scheduler-k8s02                       1/1     Running            9          34d
kube-scheduler-k8s03                       1/1     Running            8          34d

In addition, coredns cannot reach the internal kubernetes service either.

From a node, if I run wget -S 10.96.0.1:443, I get a response:

wget -S 10.96.0.1:443
--2020-06-23 13:12:12--  http://10.96.0.1:443/
Connecting to 10.96.0.1:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.0 400 Bad Request
2020-06-23 13:12:12 ERROR 400: Bad Request.

However, if I run wget -S 10.96.0.1:443 inside a pod, I get a timeout error.

Also, I cannot ping the nodes from the pods.

The cluster pod CIDR is 192.168.0.0/16.

I decided to recreate the cluster with a different pod CIDR.
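The node-versus-pod comparison above can also be done with a small TCP probe when wget is not available in the image. This is only a sketch: it assumes a Python interpreter exists in the container, and `can_connect` is a hypothetical helper name, not part of any Kubernetes or Calico tooling. It checks only whether a TCP connection can be opened, which is what the timeout in the pod suggests is failing.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout.

    Hypothetical helper for illustration only. It does not speak TLS or
    HTTP; it tests plain TCP reachability.
    """
    try:
        # create_connection raises OSError (including timeouts and
        # "connection refused") when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_connect("10.96.0.1", 443)` once on a node and once inside a pod should mirror the wget results: if it returns True on the node and False in the pod, the problem is on the pod network path rather than in the API server itself.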
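One common reason for choosing a different pod CIDR is that the pod range overlaps the network the nodes themselves live on, which can make pod-to-node traffic get routed incorrectly. The post does not state the node addresses, so the node IPs and the alternative CIDR below are hypothetical placeholders. A minimal sketch of checking for such an overlap with Python's standard `ipaddress` module:

```python
import ipaddress


def nodes_inside_pod_cidr(pod_cidr, node_ips):
    """Return the node IPs that fall inside the pod CIDR."""
    network = ipaddress.ip_network(pod_cidr)
    return [ip for ip in node_ips if ipaddress.ip_address(ip) in network]


def cidrs_overlap(cidr_a, cidr_b):
    """Return True if the two CIDR ranges share any address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


if __name__ == "__main__":
    pod_cidr = "192.168.0.0/16"  # from the post
    # Hypothetical node addresses; replace with the INTERNAL-IP column
    # of `kubectl get nodes -o wide`.
    node_ips = ["192.168.1.10", "192.168.1.11", "10.0.0.5"]
    print(nodes_inside_pod_cidr(pod_cidr, node_ips))
    # A candidate replacement pod CIDR (assumption) checked against a
    # hypothetical node subnet.
    print(cidrs_overlap("10.244.0.0/16", "192.168.1.0/24"))
```

If the first function returns any node IPs, the pod range and the node network collide, and picking a pod CIDR that overlaps neither the node subnet nor the service range (10.96.0.1 in the post lies in the service range) avoids that collision.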