Kubernetes API server does not start automatically after the master reboots

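I set up a small cluster with kubeadm. It worked fine and port 6443 was up. After rebooting my system, however, the cluster no longer comes up.

What should I do?

Here is some information: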

systemctl status kubelet

● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
        └─10-kubeadm.conf
Active: active (running) since Sun 2020-04-05 14:16:44 UTC; 6s ago
  Docs: https://kubernetes.io/docs/home/
  Main PID: 31079 (kubelet)
 Tasks: 20 (limit: 4915)
CGroup: /system.slice/kubelet.service
        └─31079 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet

 k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://infra01.mydomainname.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dtest-infra01&limit=500&resourceVersion=0: dial tcp 116.66.187.210:6443: connect: connection refused

kubectl get nodes

The connection to the server infra01.mydomainname.com:6443 was refused - did you specify the right host or port?

kubeadm version

kubeadm version: &version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:12:12Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

journalctl -xeu kubelet

 6   18167 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: 
           Failed to list *v1.Node: Get https://infra01.mydomainname.com
 1   18167 reflector.go:153] 
           k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://huawei-infra01.s
 4   18167 aws_credentials.go:77] while getting AWS credentials 
           NoCredentialProviders: no valid providers in chain. Deprecated.
           messaging see aws.Config.CredentialsChainVerboseErrors
 6   18167 kuberuntime_manager.go:211] Container runtime docker initialized, 
           version: 19.03.7, apiVersion: 1.40.0
 6   18167 server.go:1113] Started kubelet
 1   18167 kubelet.go:1302] Image garbage collection failed once. Stats 
           initialization may not have completed yet: failed to get imageF
 8   18167 server.go:144] Starting to listen on 0.0.0.0:10250
 4   18167 server.go:778] Starting healthz server failed: listen tcp 
           127.0.0.1:10248: bind: address already in use
 5   18167 fs_resource_analyzer.go:64] Starting FS ResourceAnalyzer
 4   18167 volume_manager.go:265] Starting Kubelet Volume Manager
 1   18167 desired_state_of_world_populator.go:138] Desired state populator 
           starts to run
 3   18167 server.go:384] Adding debug handlers to kubelet server.
 4   18167 server.go:158] listen tcp 0.0.0.0:10250: bind: address already in 
           use

Docker

docker run hello-world
Hello from Docker!

Ubuntu

lsb_release -a
Ubuntu 18.04.2 LTS

Swap && kubeconfig

Swap is turned off and kubeconfig was exported correctly.

Note
Resetting the cluster fixes the problem, but that should be a last resort.

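The kubelet cannot start properly because its port is already in use, so it cannot create the pod for the API server. Use the following command to find out which process is holding port 10250: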

[root@master admin]# ss -lntp | grep 10250
LISTEN     0      128         :::10250                   :::*                   users:(("kubelet",pid=23373,fd=20))

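This gives you the PID and the name of the process. If an unwanted process is holding the port, you can kill it, and the port then becomes available to the kubelet.

After killing the process, run the command above again; it should return nothing.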
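
The lookup above can be scripted. This is a minimal sketch, not a definitive tool: port_owners is a hypothetical helper name, and it assumes the ss -lntp column layout shown above, with the local address in the fourth column and a users:(("name",pid=N,fd=M)) field at the end. It reads ss output on stdin and prints one "name pid" pair per listener on the given port.

```shell
#!/usr/bin/env bash
# Sketch: list the process name and PID of whatever listens on a TCP port.
# port_owners is a hypothetical helper, not part of ss or Kubernetes.
# Usage (as root, so ss can show process info):
#   ss -lntp | port_owners 10250
port_owners() {
  local port="$1"
  # Keep only lines whose local address (4th column) ends in ":<port>",
  # then pull the name and pid out of the users:(("name",pid=N,fd=M)) field.
  awk -v port="$port" '$4 ~ (":" port "$")' \
    | grep -o '"[^"]*",pid=[0-9]*' \
    | sed -E 's/^"([^"]*)",pid=([0-9]+)$/\1 \2/'
}
```

For the ss line shown above this prints "kubelet 23373". Empty output means nothing is listening on the port, which is the state you want to see after killing the unwanted process.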

To be on the safe side, run kubeadm reset and then kubeadm init; it should go through.

Edit:

Stopping the kubelet on the node with snap stop kubelet did the trick.
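
A possible reading of this fix, stated as an assumption: a second kubelet installed as a snap was holding ports 10248 and 10250, so the kubeadm-managed kubelet could not bind them. The sketch below follows that reading. stop_snap_kubelet is a hypothetical helper name; it only uses snap list, snap stop, and systemctl restart, and it does not address whether the snap service comes back on the next reboot.

```shell
#!/usr/bin/env bash
# Sketch, assuming a snap-packaged kubelet competes with the kubeadm one.
# stop_snap_kubelet is a hypothetical helper; run it as root.
stop_snap_kubelet() {
  if ! command -v snap >/dev/null 2>&1; then
    echo "snap is not installed; nothing to stop"
    return 0
  fi
  # `snap list kubelet` exits non-zero when no such snap is installed.
  if snap list kubelet >/dev/null 2>&1; then
    echo "stopping snap-packaged kubelet"
    snap stop kubelet
    # Restart the systemd-managed kubelet so it can bind the freed ports.
    systemctl restart kubelet
  else
    echo "no kubelet snap found; look for another process on port 10250"
  fi
}
```

After running it, systemctl status kubelet and kubectl get nodes should show whether the API server pod has come back.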