Kubernetes networking: pods cannot reach nodes
I have a Kubernetes cluster with 3 masters and 7 workers, and I use Calico as the CNI. When I deploy Calico, the calico-kube-controllers-xxx pod fails because it cannot reach 10.96.0.1:443.
2020-06-23 13:05:28.737 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", DatastoreType:"kubernetes"}
W0623 13:05:28.740128 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2020-06-23 13:05:28.742 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2020-06-23 13:05:38.742 [ERROR][1] client.go 261: Error getting cluster information config ClusterInformation="default" error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-06-23 13:05:38.742 [FATAL][1] main.go 114: Failed to initialize Calico datastore error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
This is the state of the kube-system namespace:
kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-77d6cbc65f-6bmjg 0/1 CrashLoopBackOff 56 4h33m
calico-node-94pkr 1/1 Running 0 36m
calico-node-d8vc4 1/1 Running 0 36m
calico-node-fgpd4 1/1 Running 0 37m
calico-node-jqgkp 1/1 Running 0 37m
calico-node-m9lds 1/1 Running 0 37m
calico-node-n5qmb 1/1 Running 0 37m
calico-node-t46jb 1/1 Running 0 36m
calico-node-w6xch 1/1 Running 0 38m
calico-node-xpz8k 1/1 Running 0 37m
calico-node-zbw4x 1/1 Running 0 36m
coredns-5644d7b6d9-ms7gv 0/1 Running 0 4h33m
coredns-5644d7b6d9-thwlz 0/1 Running 0 4h33m
kube-apiserver-k8s01 1/1 Running 7 34d
kube-apiserver-k8s02 1/1 Running 9 34d
kube-apiserver-k8s03 1/1 Running 7 34d
kube-controller-manager-k8s01 1/1 Running 7 34d
kube-controller-manager-k8s02 1/1 Running 9 34d
kube-controller-manager-k8s03 1/1 Running 8 34d
kube-proxy-9dppr 1/1 Running 3 4d
kube-proxy-9hhm9 1/1 Running 3 4d
kube-proxy-9svfk 1/1 Running 1 4d
kube-proxy-jctxm 1/1 Running 3 4d
kube-proxy-lsg7m 1/1 Running 3 4d
kube-proxy-m257r 1/1 Running 1 4d
kube-proxy-qtbbz 1/1 Running 2 4d
kube-proxy-v958j 1/1 Running 2 4d
kube-proxy-x97qx 1/1 Running 2 4d
kube-proxy-xjkjl 1/1 Running 3 4d
kube-scheduler-k8s01 1/1 Running 7 34d
kube-scheduler-k8s02 1/1 Running 9 34d
kube-scheduler-k8s03 1/1 Running 8 34d
In addition, CoreDNS cannot reach the internal kubernetes service either.
From a node, if I run wget -S 10.96.0.1:443, I get a response:
wget -S 10.96.0.1:443
--2020-06-23 13:12:12-- http://10.96.0.1:443/
Connecting to 10.96.0.1:443... connected.
HTTP request sent, awaiting response...
HTTP/1.0 400 Bad Request
2020-06-23 13:12:12 ERROR 400: Bad Request.
However, if I run wget -S 10.96.0.1:443 from inside a pod, I get a timeout error.
I also cannot ping the nodes from pods.
The cluster pod CIDR is 192.168.0.0/16.
I decided to recreate the cluster with a different pod CIDR.
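The post does not say why a different pod CIDR would help. One plausible reason, offered here only as an assumption, is that 192.168.0.0/16 overlaps the network the nodes themselves sit on, which can break pod-to-node and pod-to-service routing. The Python sketch below shows how such an overlap could be checked before recreating the cluster. The node subnet is hypothetical because the question never gives the node addresses, and the service CIDR is assumed to be the kubeadm default, which 10.96.0.1 would fall into.

```python
import ipaddress


def overlapping(pod_cidr, other_cidrs):
    """Return the entries of other_cidrs whose address range overlaps pod_cidr."""
    pod_net = ipaddress.ip_network(pod_cidr)
    return [c for c in other_cidrs if pod_net.overlaps(ipaddress.ip_network(c))]


pod_cidr = "192.168.0.0/16"    # pod CIDR stated in the question
service_cidr = "10.96.0.0/12"  # assumption: kubeadm default service range
node_cidr = "192.168.1.0/24"   # hypothetical: node subnet is not given in the question

# Any CIDR listed here shares addresses with the pod network.
print(overlapping(pod_cidr, [service_cidr, node_cidr]))
```

If the list comes back non-empty, a pod CIDR that is disjoint from both the node and service ranges would be the safer choice for the new cluster.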