Kubespray stops in the middle of the process, https://127.0.0.1:6443/healthz, Request failed: <urlopen error Tunnel connection failed: 403 Forbidden>

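I am trying to install Kubernetes with Kubespray on 3 masters, 3 etcd nodes, and 2 worker nodes, but the Kubespray playbook stops partway through. At one point it printed the following message, yet the run continued: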

TASK [kubernetes/kubeadm : Join to cluster with ignores] *
fatal: [lsrv-k8s-node1]: FAILED! => {"changed": true, "cmd": ["timeout", "-k", "120s", "120s", "/usr/local/bin/kubeadm", "join", "config", "/etc/kubernetes/kubeadm-client.conf", "ignore-preflight-errors=all"], "delta": "0:01:03.639553", "end": "2020-04-25 23:08:51.163709", "msg": "non-zero return code", "rc": 1, "start": "2020-04-25 23:07:47.524156", "stderr": "W0425 23:07:47.569297   49639 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.\nW0425 23:07:47.570267   49639 common.go:77] your configuration file uses a deprecated API spec: \"kubeadm.k8s.io/v1beta1\". Please use 'kubeadm config migrate old-config old.yaml new-config new.yaml', which will write the new, similar spec using a newer API version.\n\t[WARNING DirAvailableetc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty\n\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/\n\t[WARNING HTTPProxy]: Connection to \"https://192.168.72.133\" uses proxy \"https://192.168.70.145:3128\". If that is not intended, adjust your proxy settings\nerror execution phase preflight: couldn't validate the identity of the API Server: Get https://192.168.72.133:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: proxyconnect tcp: tls: first record does not look like a TLS handshake\nTo see the stack trace of this error execute with v=5 or higher", "stderr_lines": ["W0425 23:07:47.569297   49639 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.", "W0425 23:07:47.570267   49639 common.go:77] your configuration file uses a deprecated API spec: \"kubeadm.k8s.io/v1beta1\". 
Please use 'kubeadm config migrate old-config old.yaml new-config new.yaml', which will write the new, similar spec using a newer API version.", "\t[WARNING DirAvailableetc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty", "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/", "\t[WARNING HTTPProxy]: Connection to \"https://192.168.72.133\" uses proxy \"https://192.168.70.145:3128\". If that is not intended, adjust your proxy settings", "error execution phase preflight: couldn't validate the identity of the API Server: Get https://192.168.72.133:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: proxyconnect tcp: tls: first record does not look like a TLS handshake", "To see the stack trace of this error execute with v=5 or higher"], "stdout": "[preflight] Running pre-flight checks", "stdout_lines": ["[preflight] Running pre-flight checks"]}
fatal: [lsrv-k8s-node2]: FAILED! => {"changed": true, "cmd": ["timeout", "-k", "120s", "120s", "/usr/local/bin/kubeadm", "join", "config", "/etc/kubernetes/kubeadm-client.conf", "ignore-preflight-errors=all"], "delta": "0:01:03.644100", "end": "2020-04-25 23:08:51.182100", "msg": "non-zero return code", "rc": 1, "start": "2020-04-25 23:07:47.538000", "stderr": "W0425 23:07:47.583487   30148 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.\nW0425 23:07:47.584414   30148 common.go:77] your configuration file uses a deprecated API spec: \"kubeadm.k8s.io/v1beta1\". Please use 'kubeadm config migrate old-config old.yaml new-config new.yaml', which will write the new, similar spec using a newer API version.\n\t[WARNING DirAvailableetc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty\n\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/\n\t[WARNING HTTPProxy]: Connection to \"https://192.168.72.133\" uses proxy \"https://192.168.70.145:3128\". If that is not intended, adjust your proxy settings\nerror execution phase preflight: couldn't validate the identity of the API Server: Get https://192.168.72.133:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: proxyconnect tcp: tls: first record does not look like a TLS handshake\nTo see the stack trace of this error execute with v=5 or higher", "stderr_lines": ["W0425 23:07:47.583487   30148 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.", "W0425 23:07:47.584414   30148 common.go:77] your configuration file uses a deprecated API spec: \"kubeadm.k8s.io/v1beta1\". 
Please use 'kubeadm config migrate old-config old.yaml new-config new.yaml', which will write the new, similar spec using a newer API version.", "\t[WARNING DirAvailableetc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty", "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/", "\t[WARNING HTTPProxy]: Connection to \"https://192.168.72.133\" uses proxy \"https://192.168.70.145:3128\". If that is not intended, adjust your proxy settings", "error execution phase preflight: couldn't validate the identity of the API Server: Get https://192.168.72.133:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: proxyconnect tcp: tls: first record does not look like a TLS handshake", "To see the stack trace of this error execute with v=5 or higher"], "stdout": "[preflight] Running pre-flight checks", "stdout_lines": ["[preflight] Running pre-flight checks"]}
Saturday 25 April 2020  23:08:51 +0430 (0:01:03.866)       0:06:53.654  

TASK [kubernetes/kubeadm : Display kubeadm join stderr if any] *
ok: [lsrv-k8s-node1] => {
    "msg": "Joined with warnings\n['W0425 23:07:47.569297   49639 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.', 'W0425 23:07:47.570267   49639 common.go:77] your configuration file uses a deprecated API spec: \"kubeadm.k8s.io/v1beta1\". Please use \'kubeadm config migrate old-config old.yaml new-config new.yaml\', which will write the new, similar spec using a newer API version.', '\t[WARNING DirAvailableetc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty', '\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/', '\t[WARNING HTTPProxy]: Connection to \"https://192.168.72.133\" uses proxy \"https://192.168.70.145:3128\". If that is not intended, adjust your proxy settings', \"error execution phase preflight: couldn't validate the identity of the API Server: Get https://192.168.72.133:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: proxyconnect tcp: tls: first record does not look like a TLS handshake\", 'To see the stack trace of this error execute with v=5 or higher']\n"
}
ok: [lsrv-k8s-node2] => {
    "msg": "Joined with warnings\n['W0425 23:07:47.583487   30148 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.', 'W0425 23:07:47.584414   30148 common.go:77] your configuration file uses a deprecated API spec: \"kubeadm.k8s.io/v1beta1\". Please use \'kubeadm config migrate old-config old.yaml new-config new.yaml\', which will write the new, similar spec using a newer API version.', '\t[WARNING DirAvailableetc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty', '\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/', '\t[WARNING HTTPProxy]: Connection to \"https://192.168.72.133\" uses proxy \"https://192.168.70.145:3128\". If that is not intended, adjust your proxy settings', \"error execution phase preflight: couldn't validate the identity of the API Server: Get https://192.168.72.133:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: proxyconnect tcp: tls: first record does not look like a TLS handshake\", 'To see the stack trace of this error execute with v=5 or higher']\n"
}
Saturday 25 April 2020  23:08:51 +0430 (0:00:00.082)       0:06:53.737  
Saturday 25 April 2020  23:08:51 +0430 (0:00:00.050)       0:06:53.787  

But in the end it stopped at this point:

PLAY [kube-master] *


TASK [kubespray-defaults : Configure defaults] *
ok: [lsrv-k8s-mstr1] => {
    "msg": "Check roles/kubespray-defaults/defaults/main.yml"
}
ok: [lsrv-k8s-mstr2] => {
    "msg": "Check roles/kubespray-defaults/defaults/main.yml"
}
ok: [lsrv-k8s-mstr3] => {
    "msg": "Check roles/kubespray-defaults/defaults/main.yml"
}
Saturday 25 April 2020  23:09:41 +0430 (0:00:00.044)       0:07:44.209  
Saturday 25 April 2020  23:09:41 +0430 (0:00:00.043)       0:07:44.253  
Saturday 25 April 2020  23:09:41 +0430 (0:00:00.044)       0:07:44.297  
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (20 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (19 retries left).
...
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (2 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (1 retries left).

TASK [kubernetes-apps/ansible : Kubernetes Apps | Wait for kube-apiserver] *
fatal: [lsrv-k8s-mstr1]: FAILED! => {"attempts": 20, "changed": false, "content": "", "elapsed": 0, "msg": "Status code was -1 and not [200]: Request failed: <urlopen error Tunnel connection failed: 403 Forbidden>", "redirected": false, "status": -1, "url": "https://127.0.0.1:6443/healthz"}

NO MORE HOSTS LEFT *

PLAY RECAP *
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
lsrv-k8s-etcd1             : ok=152  changed=8    unreachable=0    failed=0    skipped=213  rescued=0    ignored=0   
lsrv-k8s-etcd2             : ok=142  changed=8    unreachable=0    failed=0    skipped=206  rescued=0    ignored=0   
lsrv-k8s-etcd3             : ok=142  changed=8    unreachable=0    failed=0    skipped=206  rescued=0    ignored=0   
lsrv-k8s-mstr1             : ok=626  changed=48   unreachable=0    failed=1    skipped=747  rescued=0    ignored=0   
lsrv-k8s-mstr2             : ok=464  changed=40   unreachable=0    failed=0    skipped=605  rescued=0    ignored=0   
lsrv-k8s-mstr3             : ok=466  changed=40   unreachable=0    failed=0    skipped=603  rescued=0    ignored=0   
lsrv-k8s-node1             : ok=385  changed=22   unreachable=0    failed=1    skipped=334  rescued=1    ignored=0   
lsrv-k8s-node2             : ok=385  changed=22   unreachable=0    failed=1    skipped=334  rescued=1    ignored=0   

Saturday 25 April 2020  23:10:07 +0430 (0:00:25.764)       0:08:10.061  
=============================================================================== 
kubernetes/kubeadm : Join to cluster - 64.06s
kubernetes/kubeadm : Join to cluster with ignores  63.87s
kubernetes-apps/ansible : Kubernetes Apps | Wait for kube-apiserver  25.76s
kubernetes/preinstall : Update package management cache (APT)  17.29s
etcd : Gen_certs | Write etcd master certs - 11.07s
kubernetes/master : Master | wait for kube-scheduler  7.76s
Gather necessary facts  6.35s
kubernetes-apps/ingress_controller/cert_manager : Cert Manager | Remove legacy namespace  5.64s
container-engine/docker : ensure docker packages are installed  5.14s
kubernetes-apps/ingress_controller/ingress_nginx : NGINX Ingress Controller | Create manifests  4.48s
kubernetes/master : kubeadm | write out kubeadm certs - 4.41s
kubernetes-apps/ingress_controller/cert_manager : Cert Manager | Create manifests - 3.99s
etcd : Gen_certs | Gather etcd master certs - 3.70s
bootstrap-os : Fetch /etc/os-release  3.63s
bootstrap-os : Install dbus for the hostname module - 3.29s
kubernetes-apps/external_provisioner/local_path_provisioner : Local Path Provisioner | Create manifests - 3.11s
kubernetes-apps/ingress_controller/ingress_nginx : NGINX Ingress Controller | Apply manifests - 3.05s
kubernetes/client : Generate admin kubeconfig with external api endpoint  2.70s
kubernetes/master : kubeadm | Check if apiserver.crt contains all needed SANs - 2.68s
download : download | Download files / images - 2.67s

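The health check does not seem to work; the request fails with a 403:

fatal: [lsrv-k8s-mstr1]: FAILED! => {"attempts": 20, "changed": false, "content": "", "elapsed": 0, "msg": "Status code was -1 and not [200]: Request failed: <urlopen error Tunnel connection failed: 403 Forbidden>", "redirected": false, "status": -1, "url": "https://127.0.0.1:6443/healthz"}

Any guidance would be appreciated.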

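Your error message suggests that the underlying problem may be an authentication issue. Make sure you have not missed or misconfigured any of the pre-installation steps.

These commands will give you some information about the state of the cluster: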

kubectl get componentstatus

kubectl get nodes

kubectl get pods --all-namespaces

The problem was caused by https_proxy being set in the /etc/environment file on the worker nodes.
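To illustrate why a proxy variable can break a request to 127.0.0.1, here is a minimal sketch. It assumes the failing health check went through a Python urllib-based client that honors the proxy environment variables, which the "urlopen error Tunnel connection failed" text suggests. It uses `proxy_bypass_environment`, an undocumented helper in `urllib.request`. The proxy address comes from the kubeadm warning in the log; the `no` (no_proxy) value is hypothetical, since the post does not show one.

```python
from urllib.request import proxy_bypass_environment


def goes_through_proxy(host_port: str, proxies: dict) -> bool:
    """Return True if an HTTPS request to host_port would be sent via the proxy."""
    # urllib only skips the proxy when the host matches the "no" (no_proxy) entry.
    return "https" in proxies and not proxy_bypass_environment(host_port, proxies)


# Proxy address as reported by the kubeadm HTTPProxy warning in the question.
with_proxy_only = {"https": "https://192.168.70.145:3128"}

# Hypothetical no_proxy list (an assumption, not taken from the original post).
with_no_proxy = dict(with_proxy_only, no="127.0.0.1,localhost,192.168.72.133")

for target in ("127.0.0.1:6443", "192.168.72.133:6443"):
    print(
        target,
        "proxied without no_proxy:", goes_through_proxy(target, with_proxy_only),
        "proxied with no_proxy:", goes_through_proxy(target, with_no_proxy),
    )
```

With only https_proxy set, even the loopback address is handed to the proxy, which would explain a proxy answering the CONNECT request with 403 Forbidden.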

After I removed the https_proxy and http_proxy lines, the problem was solved.
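The cleanup can be sketched as a small filter over the contents of /etc/environment. This is only an illustration, assuming the file uses plain `KEY=value` lines (optionally prefixed with `export`). The function name `strip_proxy_lines` and the sample contents are hypothetical; only the https proxy address appears in the original log.

```python
import re

# Matches lines that assign http_proxy or https_proxy, in any letter case.
PROXY_LINE = re.compile(r"^\s*(?:export\s+)?https?_proxy\s*=", re.IGNORECASE)


def strip_proxy_lines(text: str) -> str:
    """Return text with every http_proxy/https_proxy assignment line removed."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not PROXY_LINE.match(line)
    ]
    return "".join(kept)


# Hypothetical file contents for demonstration purposes.
sample = (
    'PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"\n'
    'http_proxy="http://192.168.70.145:3128"\n'
    'https_proxy="https://192.168.70.145:3128"\n'
    "LANG=C\n"
)

print(strip_proxy_lines(sample))
```

The sketch deliberately does not write to /etc/environment. Reviewing the filtered output first, and then logging in again (or rebooting) so that running services pick up the change, is the safer order.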