kubernetes: api-server and controller-manager can't start
I have a running k8s cluster, set up with kubeadm. I have the problem that the api-server and controller-manager pods can't start due to a bind exception:
failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use
We recently downgraded docker-ce from version 18.01 to 17.09 on all nodes, because docker had a bug when recreating containers. After the downgrade, however, the cluster worked fine, i.e. the api-server and controller-manager were running.
I googled around for bind-exception problems with the api-server and controller-manager, but couldn't find anything useful.
I checked that there is no other process running on that port on the master node.
Things I have tried:
- Restarting the kubelet on the master:
systemctl restart kubelet
- Restarting the docker daemon while watching for stale containers: found none
- Checking whether any process is running on 6443:
lsof -i:6443
prints nothing, but nmap localhost -p 6443
shows the port as open with service unknown
- Also restarted the system pods
Restarting the kubelet and the docker daemon works fine, but has no effect on the problem.
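A caveat on the lsof check above: without root privileges lsof may not see listeners owned by other users, and it only inspects its own network namespace. A quick way to rule both out (a sketch, assuming iproute2's ss is available) is to repeat the check as root:
$ sudo ss -lntp | grep 6443
$ sudo lsof -nP -iTCP:6443 -sTCP:LISTEN
If both still print nothing while nmap reports the port as open, the listener most likely lives in another network namespace, e.g. inside a container.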
Kubeadm / kubectl versions:
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Using weave as the network CNI.
Edit:
docker ps on the master node:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
59239d32b1e4 weaveworks/weave-npc "/usr/bin/weave-npc" About an hour ago Up About an hour k8s_weave-npc_weave-net-74vsh_kube-system_99f6ee35-0f56-11e8-95e1-1614e1ecd749_0
7cb888c1ab4d weaveworks/weave-kube "/home/weave/launc..." About an hour ago Up About an hour k8s_weave_weave-net-74vsh_kube-system_99f6ee35-0f56-11e8-95e1-1614e1ecd749_0
1ad50c15f816 gcr.io/google_containers/pause-amd64:3.0 "/pause" About an hour ago Up About an hour k8s_POD_weave-net-74vsh_kube-system_99f6ee35-0f56-11e8-95e1-1614e1ecd749_0
ecb845f1dfae gcr.io/google_containers/etcd-amd64 "etcd --advertise-..." 2 hours ago Up 2 hours k8s_etcd_etcd-kube01_kube-system_1b6fafb5dc39ea18814d9bc27da851eb_6
001234690d7a gcr.io/google_containers/kube-scheduler-amd64 "kube-scheduler --..." 2 hours ago Up 2 hours k8s_kube-scheduler_kube-scheduler-kube01_kube-system_69c12074e336b0dbbd0a1666ce05226a_3
0ce04f222f08 gcr.io/google_containers/pause-amd64:3.0 "/pause" 2 hours ago Up 2 hours k8s_POD_kube-scheduler-kube01_kube-system_69c12074e336b0dbbd0a1666ce05226a_3
0a3d9eabd961 gcr.io/google_containers/pause-amd64:3.0 "/pause" 2 hours ago Up 2 hours k8s_POD_kube-apiserver-kube01_kube-system_95c67f50e46db081012110e8bcce9dfc_3
c77767104eb9 gcr.io/google_containers/pause-amd64:3.0 "/pause" 2 hours ago Up 2 hours k8s_POD_etcd-kube01_kube-system_1b6fafb5dc39ea18814d9bc27da851eb_4
319873797a8a gcr.io/google_containers/pause-amd64:3.0 "/pause" 2 hours ago Up 2 hours k8s_POD_kube-controller-manager-kube01_kube-system_f64b9b5ba10a00baa5c176d5877e8671_4
journalctl - full output:
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.205824 3195 kuberuntime_manager.go:758] checking backoff for container "kube-controller-manager" in pod "kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.205991 3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)
Feb 11 19:51:03 kube01 kubelet[3195]: E0211 19:51:03.206039 3195 pod_workers.go:186] Error syncing pod f64b9b5ba10a00baa5c176d5877e8671 ("kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"), skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.206161 3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:03 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.206234 3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:03 kube01 kubelet[3195]: I0211 19:51:03.206350 3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:03 kube01 kubelet[3195]: E0211 19:51:03.206381 3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:12 kube01 kubelet[3195]: E0211 19:51:12.816797 3195 fs.go:418] Stat fs failed. Error: no such file or directory
Feb 11 19:51:14 kube01 kubelet[3195]: I0211 19:51:14.203327 3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:14 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:14 kube01 kubelet[3195]: I0211 19:51:14.203631 3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:14 kube01 kubelet[3195]: I0211 19:51:14.203833 3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:14 kube01 kubelet[3195]: E0211 19:51:14.203886 3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:15 kube01 kubelet[3195]: I0211 19:51:15.203837 3195 kuberuntime_manager.go:514] Container {Name:kube-controller-manager Image:gcr.io/google_containers/kube-controller-manager-amd64:v1.9.2 Command:[kube-controller-manager --leader-elect=true --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --address=127.0.0.1 --use-service-account-credentials=true --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>} {Name:kubeconfig ReadOnly:true MountPath:/etc/kubernetes/controller-manager.conf SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:10252,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:15 kube01 kubelet[3195]: I0211 19:51:15.205830 3195 kuberuntime_manager.go:758] checking backoff for container "kube-controller-manager" in pod "kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:15 kube01 kubelet[3195]: I0211 19:51:15.207429 3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)
Feb 11 19:51:15 kube01 kubelet[3195]: E0211 19:51:15.207813 3195 pod_workers.go:186] Error syncing pod f64b9b5ba10a00baa5c176d5877e8671 ("kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"), skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:26 kube01 kubelet[3195]: I0211 19:51:26.203361 3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:26 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:26 kube01 kubelet[3195]: I0211 19:51:26.205258 3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:26 kube01 kubelet[3195]: I0211 19:51:26.205670 3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:26 kube01 kubelet[3195]: E0211 19:51:26.205965 3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:29 kube01 kubelet[3195]: I0211 19:51:29.203234 3195 kuberuntime_manager.go:514] Container {Name:kube-controller-manager Image:gcr.io/google_containers/kube-controller-manager-amd64:v1.9.2 Command:[kube-controller-manager --leader-elect=true --controllers=*,bootstrapsigner,tokencleaner --kubeconfig=/etc/kubernetes/controller-manager.conf --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --address=127.0.0.1 --use-service-account-credentials=true --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>} {Name:kubeconfig ReadOnly:true MountPath:/etc/kubernetes/controller-manager.conf SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:10252,Host:127.0.0.1,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:29 kube01 kubelet[3195]: I0211 19:51:29.207713 3195 kuberuntime_manager.go:758] checking backoff for container "kube-controller-manager" in pod "kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:29 kube01 kubelet[3195]: I0211 19:51:29.208492 3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)
Feb 11 19:51:29 kube01 kubelet[3195]: E0211 19:51:29.208875 3195 pod_workers.go:186] Error syncing pod f64b9b5ba10a00baa5c176d5877e8671 ("kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"), skipping: failed to "StartContainer" for "kube-controller-manager" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-controller-manager pod=kube-controller-manager-kube01_kube-system(f64b9b5ba10a00baa5c176d5877e8671)"
Feb 11 19:51:32 kube01 kubelet[3195]: E0211 19:51:32.369188 3195 fs.go:418] Stat fs failed. Error: no such file or directory
Feb 11 19:51:39 kube01 kubelet[3195]: I0211 19:51:39.203802 3195 kuberuntime_manager.go:514] Container {Name:kube-apiserver Image:gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 Command:[kube-apiserver --client-ca-file=/etc/kubernetes/pki/ca.crt --admission-control=Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota --allow-privileged=true --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --requestheader-extra-headers-prefix=X-Remote-Extra- --advertise-address=207.154.252.249 --service-cluster-ip-range=10.96.0.0/12 --insecure-port=0 --enable-bootstrap-token-auth=true --requestheader-allowed-names=front-proxy-client --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-username-headers=X-Remote-User --service-account-key-file=/etc/kubernetes/pki/sa.pub --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --secure-port=6443 --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-group-headers=X-Remote-Group --tls-private-key-file=/etc/kubernetes/pki/apiserver.key --authorization-mode=Node,RBAC --etcd-servers=http://127.0.0.1:2379] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:250 scale:-3} d:{Dec:<nil>} s:250m Format:DecimalSI}]} VolumeMounts:[{Name:k8s-certs ReadOnly:true MountPath:/etc/kubernetes/pki SubPath: MountPropagation:<nil>} {Name:ca-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthz,Port:6443,Host:207.154.252.249,Scheme:HTTPS,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:15,TimeoutSeconds:15,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:8,} ReadinessProbe:nil Lifecycle:nil Terminat
Feb 11 19:51:39 kube01 kubelet[3195]: ionMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Feb 11 19:51:39 kube01 kubelet[3195]: I0211 19:51:39.205508 3195 kuberuntime_manager.go:758] checking backoff for container "kube-apiserver" in pod "kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
Feb 11 19:51:39 kube01 kubelet[3195]: I0211 19:51:39.206071 3195 kuberuntime_manager.go:768] Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)
Feb 11 19:51:39 kube01 kubelet[3195]: E0211 19:51:39.206336 3195 pod_workers.go:186] Error syncing pod 95c67f50e46db081012110e8bcce9dfc ("kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-kube01_kube-system(95c67f50e46db081012110e8bcce9dfc)"
kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.96.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt"
Environment="KUBELET_CADVISOR_ARGS=--cadvisor-port=0"
Environment="KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS
docker info - cgroup:
WARNING: No swap limit support
Cgroup Driver: cgroupfs
Kernel:
Linux kube01 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Distribution:
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
The problem is simply that some service is already bound to 6443. You can check with netstat -lutpn | grep 6443, then kill that process and restart the kubelet service.
$ netstat -lutpn | grep 6443
tcp6 0 0 :::6443 :::* LISTEN 11395/some-service
$ kill 11395
$ service kubelet restart
This should fix the problem.
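If netstat is not available, fuser from the psmisc package can do the lookup and the kill in one step (a sketch, assuming it runs as root):
$ fuser -v 6443/tcp
$ fuser -k 6443/tcp
$ service kubelet restart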
With kubernetes, this usually happens when kubernetes was not reset properly and the containers were not cleaned up properly. Do this...
$ kubeadm reset
$ docker rm -f $(docker ps -a -q)
$ kubeadm init <options> # new initialization
This means the nodes will have to re-join the cluster.
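A sketch of the re-join, assuming kubeadm v1.9's join syntax; the master address is taken from the --advertise-address in the logs above, and <token> and <hash> are placeholders to be filled in from the kubeadm init / kubeadm token output:
# on the master: create a fresh bootstrap token
$ kubeadm token create
# on each worker:
$ kubeadm join --token <token> 207.154.252.249:6443 --discovery-token-ca-cert-hash sha256:<hash>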
In my case, the following helped:
- disabling swap on all nodes (master and workers)
- rebooting every node
After that, everything worked fine.
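For reference, disabling swap persistently amounts to (a minimal sketch; the sed pattern assumes an uncommented swap line in /etc/fstab):
$ swapoff -a
$ sed -i '/ swap / s/^/#/' /etc/fstab   # comment out the swap entry so it stays off after reboot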