Inconsistent responses from the Kubernetes API service, with intermittent "No route to host" errors
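I set up a Kubernetes cluster by following Kelsey Hightower's Kubernetes the Hard Way.
Unfortunately, when I hit the kubernetes service IP from a worker node to check the version, I get inconsistent responses.
Here are my cluster details: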
root@kubem1:~# kubectl get no
NAME STATUS ROLES AGE VERSION
kubew1 Ready <none> 14h v1.18.3
kubew2 Ready <none> 14h v1.18.3
root@kubem1:~# kubectl get no -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kubew1 Ready <none> 14h v1.18.3 192.168.56.103 <none> Ubuntu 18.04.4 LTS 4.15.0-76-generic containerd://1.2.9
kubew2 Ready <none> 14h v1.18.3 192.168.56.104 <none> Ubuntu 18.04.4 LTS 4.15.0-76-generic containerd://1.2.9
root@kubem1:~# kubectl get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
kubernetes ClusterIP 10.32.0.1 <none> 443/TCP 21h <none>
root@kubem1:~# kubectl get po -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-589fff4ffc-mwrpk 1/1 Running 0 163m 10.200.1.5 kubew1 <none> <none>
coredns-589fff4ffc-qps68 1/1 Running 0 163m 10.200.2.3 kubew2 <none> <none>
root@kubem1:~#
From the worker node:
kube-proxy systemd unit:
cat /etc/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Kube Proxy
Documentation=https://github.com/kubernetes/kubernetes
[Service]
ExecStart=/usr/local/bin/kube-proxy \
--config=/var/lib/kube-proxy/kube-proxy-config.yaml
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
kube-proxy config YAML file:
cat /var/lib/kube-proxy/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
kubeconfig: "/var/lib/kube-proxy/kubeconfig"
mode: "iptables"
clusterCIDR: "10.200.0.0/16"
kube-proxy service status:
root@kubew2:~# service kube-proxy status
● kube-proxy.service - Kubernetes Kube Proxy
Loaded: loaded (/etc/systemd/system/kube-proxy.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2020-05-26 07:47:22 UTC; 9min ago
Docs: https://github.com/kubernetes/kubernetes
Main PID: 11502 (kube-proxy)
Tasks: 6 (limit: 1111)
CGroup: /system.slice/kube-proxy.service
└─11502 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/kube-proxy-config.yaml
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.697056 11502 node.go:136] Successfully retrieved node IP: 192.168.56.104
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.699467 11502 server_others.go:186] Using iptables Proxier.
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.699748 11502 server.go:583] Version: v1.18.3
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.700110 11502 conntrack.go:52] Setting nf_conntrack_max to 131072
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.702569 11502 config.go:315] Starting service config controller
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.702786 11502 shared_informer.go:223] Waiting for caches to sync for service config
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.702922 11502 config.go:133] Starting endpoints config controller
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.703039 11502 shared_informer.go:223] Waiting for caches to sync for endpoints config
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.803627 11502 shared_informer.go:230] Caches are synced for endpoints config
May 26 07:47:22 kubew2 kube-proxy[11502]: I0526 07:47:22.804515 11502 shared_informer.go:230] Caches are synced for service config
root@kubew2:~#
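Here is the problematic output. The request returns the correct output two or three times, then fails with "No route to host", and then works again: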
root@kubew2:~# curl -k https://10.32.0.1:443/version
{
"major": "1",
"minor": "18",
"gitVersion": "v1.18.3",
"gitCommit": "2e7996e3e2712684bc73f0dec0200d64eec7fe40",
"gitTreeState": "clean",
"buildDate": "2020-05-20T12:43:34Z",
"goVersion": "go1.13.9",
"compiler": "gc",
"platform": "linux/amd64"
}
root@kubew2:~# curl -k https://10.32.0.1:443/version
{
"major": "1",
"minor": "18",
"gitVersion": "v1.18.3",
"gitCommit": "2e7996e3e2712684bc73f0dec0200d64eec7fe40",
"gitTreeState": "clean",
"buildDate": "2020-05-20T12:43:34Z",
"goVersion": "go1.13.9",
"compiler": "gc",
"platform": "linux/amd64"
}
root@kubew2:~# curl -k https://10.32.0.1:443/version
curl: (7) Failed to connect to 10.32.0.1 port 443: No route to host
root@kubew2:~# curl -k https://10.32.0.1:443/version
{
"major": "1",
"minor": "18",
"gitVersion": "v1.18.3",
"gitCommit": "2e7996e3e2712684bc73f0dec0200d64eec7fe40",
"gitTreeState": "clean",
"buildDate": "2020-05-20T12:43:34Z",
"goVersion": "go1.13.9",
"compiler": "gc",
"platform": "linux/amd64"
}
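I found the issue. Since this is a high-availability setup, the API service has two nodes (endpoints). Unfortunately, on the other node, 192.168.56.102, kube-apiserver could not connect to the etcd running on that node. Whenever the curl request to the kubernetes service IP was resolved to 192.168.56.102, I got "No route to host", because that API server could not fetch data from the etcd database on node 2.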
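One way to confirm which backend is at fault is to bypass the ClusterIP and query each endpoint of the kubernetes service directly. In iptables mode, kube-proxy picks one of the endpoints at random for each new connection, which would explain why only some requests fail. The following is a minimal sketch, not part of the original setup: the function names split_endpoints and probe_endpoints are hypothetical, and it assumes /version is reachable on each endpoint the same way it is through the service IP.

```shell
# Sketch only: split_endpoints and probe_endpoints are illustrative helpers.

# split_endpoints LIST: turn "a:6443,b:6443" into one endpoint per line.
split_endpoints() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# probe_endpoints LIST: curl /version on every endpoint and report the result.
# -k skips certificate verification, as in the curl calls above.
# curl exits non-zero when the connection itself fails (e.g. no route to host).
probe_endpoints() {
  split_endpoints "$1" | while IFS= read -r ep; do
    [ -n "$ep" ] || continue
    if curl -k -s -o /dev/null --max-time 5 "https://${ep}/version"; then
      echo "${ep} ok"
    else
      echo "${ep} FAILED"
    fi
  done
}
```

It can be called with the endpoint list shown by kubectl get ep, for example: probe_endpoints "192.168.56.101:6443,192.168.56.102:6443". An endpoint reported as FAILED is the one to investigate.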
I removed the second etcd member (192.168.56.102:2380) from the kube-apiserver command-line args, which previously contained:
--etcd-servers=http://192.168.56.101:2379,http://192.168.56.102:2380
I also removed the second node from the endpoints of the kubernetes service:
root@kubem1:~# kubectl get ep
NAME ENDPOINTS AGE
kubernetes 192.168.56.101:6443,192.168.56.102:6443 22h
root@kubem1:~# kubectl edit ep kubernetes
endpoints/kubernetes edited
root@kubem1:~# kubectl get ep kubernetes
NAME ENDPOINTS AGE
kubernetes 192.168.56.101:6443 22h
Now curl returns the correct output every time, without any "No route to host" error.
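Because the failure was intermittent, a single successful curl does not prove much. A rough way to check is to repeat the request many times through the service IP and count the failures. This is a sketch under the same assumptions as the curl calls above; count_failures is a hypothetical helper name.

```shell
# Sketch only: count_failures is an illustrative helper.

# count_failures URL N: request URL N times and print how many attempts failed.
count_failures() {
  url=$1
  n=$2
  failures=0
  i=0
  while [ "$i" -lt "$n" ]; do
    # A non-zero curl exit status (e.g. 7, failed to connect) counts as a failure.
    curl -k -s -o /dev/null --max-time 5 "$url" || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
}
```

For example, count_failures https://10.32.0.1:443/version 20 run on a worker node should print 0 once every remaining endpoint is healthy.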