Can't install third Kubernetes master node: kubelet TLS bootstrapping timeout in kubeadm join
When trying to set up an HA cluster with external etcd on Kubernetes 1.12, I get a timeout when running the following command:
kubeadm join <load balancer>:443 --token <token> --discovery-token-ca-cert-hash sha256:3dfa042fcc28a26da9335c14802718bbc36b82bb71b4e5dfaa70c004454932da --experimental-control-plane
Output:
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "<load balancer>:443"
[discovery] Created cluster-info discovery client, requesting info from "https://<load balancer>:443"
[discovery] Requesting info from "https://<load balancer>:443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "<load balancer>:443"
[discovery] Successfully established connection with API Server "<load balancer>:443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I1005 12:48:29.896403 8131 join.go:334] [join] running pre-flight checks before initializing the new control plane instance
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[certificates] Using the existing apiserver certificate and key.
[certificates] Using the existing apiserver-kubelet-client certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Using the existing sa key.
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
timed out waiting for the condition
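Two master nodes were installed successfully before this error occurred.
I used this as the installation guide:
https://kubernetes.io/docs/setup/independent/high-availability/#set-up-the-cluster
My load balancer runs on the same node where I am trying to install the cluster, but I don't see why that would be a problem (or maybe it is?).
The kubelet log doesn't show anything significant: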
kubelet[26132]: I1005 09:34:32.667360 26132 server.go:408] Version: v1.12.0
kubelet[26132]: I1005 09:34:32.667520 26132 plugins.go:99] No cloud provider specified.
kubelet[26132]: W1005 09:34:32.667553 26132 server.go:553] standalone mode, no API client
kubelet[26132]: W1005 09:34:32.745120 26132 server.go:465] No api server defined - no events will be sent to API server.
kubelet[26132]: I1005 09:34:32.745178 26132 server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
kubelet[26132]: I1005 09:34:32.745944 26132 container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
kubelet[26132]: I1005 09:34:32.745974 26132 container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: En
kubelet[26132]: I1005 09:34:32.746237 26132 container_manager_linux.go:271] Creating device plugin manager: true
kubelet[26132]: I1005 09:34:32.746368 26132 state_mem.go:36] [cpumanager] initializing new in-memory state store
kubelet[26132]: I1005 09:34:32.747800 26132 kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
kubelet[26132]: I1005 09:34:32.752107 26132 client.go:75] Connecting to docker on unix:///var/run/docker.sock
kubelet[26132]: I1005 09:34:32.752172 26132 client.go:104] Start docker client with request timeout=2m0s
kubelet[26132]: W1005 09:34:32.754889 26132 docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
kubelet[26132]: I1005 09:34:32.754954 26132 docker_service.go:236] Hairpin mode set to "hairpin-veth"
kubelet[26132]: W1005 09:34:32.755195 26132 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet[26132]: W1005 09:34:32.759325 26132 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
kubelet[26132]: I1005 09:34:32.762094 26132 docker_service.go:251] Docker cri networking managed by kubernetes.io/no-op
kubelet[26132]: I1005 09:34:32.789329 26132 docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan nul
kubelet[26132]: I1005 09:34:32.789503 26132 docker_service.go:269] Setting cgroupDriver to cgroupfs
kubelet[26132]: I1005 09:34:32.820067 26132 kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
kubelet[26132]: I1005 09:34:32.822547 26132 server.go:1013] Started kubelet
kubelet[26132]: W1005 09:34:32.822599 26132 kubelet.go:1387] No api server defined - no node status update will be sent.
kubelet[26132]: E1005 09:34:32.822622 26132 kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
kubelet[26132]: I1005 09:34:32.822624 26132 server.go:133] Starting to listen on 127.0.0.1:10250
kubelet[26132]: I1005 09:34:32.823855 26132 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
kubelet[26132]: I1005 09:34:32.823900 26132 status_manager.go:148] Kubernetes client is nil, not starting status manager.
kubelet[26132]: I1005 09:34:32.823919 26132 kubelet.go:1804] Starting kubelet main sync loop.
kubelet[26132]: I1005 09:34:32.823971 26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
kubelet[26132]: I1005 09:34:32.824016 26132 volume_manager.go:248] Starting Kubelet Volume Manager
kubelet[26132]: I1005 09:34:32.824094 26132 desired_state_of_world_populator.go:130] Desired state populator starts to run
kubelet[26132]: I1005 09:34:32.824656 26132 server.go:318] Adding debug handlers to kubelet server.
kubelet[26132]: I1005 09:34:32.924253 26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down]
kubelet[26132]: I1005 09:34:33.072557 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.077937 26132 cpu_manager.go:155] [cpumanager] starting with none policy
kubelet[26132]: I1005 09:34:33.077967 26132 cpu_manager.go:156] [cpumanager] reconciling every 10s
kubelet[26132]: I1005 09:34:33.077976 26132 policy_none.go:42] [cpumanager] none policy: Start
kubelet[26132]: W1005 09:34:33.078616 26132 manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
kubelet[26132]: I1005 09:34:33.078989 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.124726 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.130955 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.136320 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.136580 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.142780 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.143667 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.224945 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-ca-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.225058 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etcd-certs-0" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etcd-certs-0") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.225200 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etc-pki") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.325745 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flexvolume-dir" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-flexvolume-dir") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.325834 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-etc-pki") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.325890 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-kubeconfig") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.326047 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/dd3b0cd7d636afb2b116453dc6524f26-kubeconfig") pod "kube-scheduler-" (UID: "dd3b0cd7d636afb2b116453dc6524f26")
kubelet[26132]: I1005 09:34:33.326393 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-k8s-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.326524 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-k8s-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.326645 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-ca-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.326693 26132 reconciler.go:154] Reconciler: start to sync state
dockerd[24966]: time="2018-10-05T09:34:33.789690025+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 40806fa9041d3a65d39fdc1a68e2415f0d77f84e0c4f8c163d3bd48fec0d763f"
kubelet[26132]: W1005 09:34:33.792727 26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/92f250670b6bc27fc8b90703d1196aa3/kube-controller-manager/0.log"
dockerd[24966]: time="2018-10-05T09:34:33.820145872+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 19328df83a640d71faf86310d1a4052f3af42e75513d9745a2775532803ba122"
kubelet[26132]: W1005 09:34:33.822612 26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/dd3b0cd7d636afb2b116453dc6524f26/kube-scheduler/0.log"
dockerd[24966]: time="2018-10-05T09:34:33.836511632+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 6b9e3036a5027b42a4340ad0779be6030593d1a10df4367c0a0ca54ff1345f16"
kubelet[26132]: I1005 09:34:33.851661 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.865408 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.874766 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: W1005 09:34:34.841803 26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fc349d-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
kubelet[26132]: W1005 09:34:34.841888 26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/7c7d1db45cb11bf12de2eac803da8b77/volumes" does not exist
kubelet[26132]: W1005 09:34:34.841935 26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fbcf1b-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
kubelet[26132]: I1005 09:34:34.880168 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:34.880564 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:34.880645 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:43.121992 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:53.165661 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
sshd[26621]: Connection closed by 172.29.2.56 port 50080 [preauth]
kubelet[26132]: I1005 09:35:03.210021 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:35:13.252179 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:35:23.295605 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
Any ideas?
Edit:
Comparing the kubelet across the nodes, I found that on the other two nodes the kubelet is started like this:
kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni
After the TLS timeout, I ran this command on the third node, which resulted in:
I1005 .008343 server.go:408] Version: v1.12.0
I1005 .008857 plugins.go:99] No cloud provider specified.
I1005 .045644 certificate_store.go:131] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
I1005 .134861 server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
I1005 .135501 container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
I1005 .135551 container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms}
I1005 .135777 container_manager_linux.go:271] Creating device plugin manager: true
I1005 .135829 state_mem.go:36] [cpumanager] initializing new in-memory state store
I1005 .136055 state_mem.go:84] [cpumanager] updated default cpuset: ""
I1005 .136084 state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
I1005 .136410 kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
I1005 .136461 kubelet.go:304] Watching apiserver
I1005 .141009 client.go:75] Connecting to docker on unix:///var/run/docker.sock
I1005 .141054 client.go:104] Start docker client with request timeout=2m0s
W1005 .143351 docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I1005 .143395 docker_service.go:236] Hairpin mode set to "hairpin-veth"
W1005 .143618 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
W1005 .147722 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
W1005 .147880 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
I1005 .147944 docker_service.go:251] Docker cri networking managed by cni
I1005 .177322 docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:18 OomKillDisable:true NGoroutines:27 SystemTime:2018-10-05T .158551524+02:00 LoggingDriver:json-file CgroupDriver:cgroupfs NEventsListener:0 KernelVersion:4.18.5-1.el7.elrepo.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc4201e65b0 NCPU:40 MemTotal:134664974336 GenericResources:[] DockerRootDir:/export/data/docker HTTPProxy: HTTPSProxy: NoProxy: Name:dax Labels:[] ExperimentalBuild:false ServerVersion:17.06.2-ce ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:docker-runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil>} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:6e23458c129b551d5c9871e5174f6b1b7f6d1170 Expected:6e23458c129b551d5c9871e5174f6b1b7f6d1170} RuncCommit:{ID:810190ceaa507aa2727d7ae6f4790c76ec150bd2 Expected:810190ceaa507aa2727d7ae6f4790c76ec150bd2} InitCommit:{ID:949e6fa Expected:949e6fa} SecurityOptions:[name=seccomp,profile=default]}
I1005 .177565 docker_service.go:269] Setting cgroupDriver to cgroupfs
I1005 .211074 kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
I1005 .213560 server.go:1013] Started kubelet
E1005 .213611 kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
I1005 .213712 server.go:133] Starting to listen on 0.0.0.0:10250
I1005 .216143 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
I1005 .216334 status_manager.go:152] Starting to sync pod status with apiserver
I1005 .216447 kubelet.go:1804] Starting kubelet main sync loop.
I1005 .216962 kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
I1005 .218285 volume_manager.go:248] Starting Kubelet Volume Manager
I1005 .218904 desired_state_of_world_populator.go:130] Desired state populator starts to run
I1005 .220387 server.go:318] Adding debug handlers to kubelet server.
W1005 .221605 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
E1005 .221954 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
E1005 .317227 kubelet.go:2236] node "dax" not found
I1005 .317229 kubelet.go:1821] skipping pod synchronization - [container runtime is down]
I1005 .318558 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
I1005 .323926 kubelet_node_status.go:70] Attempting to register node dax
I1005 .332022 kubelet_node_status.go:73] Successfully registered node dax
I1005 .417546 kuberuntime_manager.go:910] updating runtime config through cri with podcidr 10.244.3.0/24
I1005 .418060 docker_service.go:345] docker cri received runtime config &RuntimeConfig{NetworkConfig:&NetworkConfig{PodCidr:10.244.3.0/24,},}
I1005 .418505 kubelet_network.go:75] Setting Pod CIDR: -> 10.244.3.0/24
I1005 .465985 cpu_manager.go:155] [cpumanager] starting with none policy
I1005 .466004 cpu_manager.go:156] [cpumanager] reconciling every 10s
I1005 .466012 policy_none.go:42] [cpumanager] none policy: Start
W1005 .466606 manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
W1005 .467018 container_manager_linux.go:803] CPUAccounting not enabled for pid:
W1005 .467029 container_manager_linux.go:806] MemoryAccounting not enabled for pid:
W1005 .467770 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
E1005 .467952 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
I1005 .520111 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-lib-modules") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520186 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-run-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-run-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520296 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "run" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-run") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520485 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-net-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-net-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520581 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy" (UniqueName: "kubernetes.io/configmap/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .520641 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-lib-modules") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .520697 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-lib-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-lib-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520755 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flannel-cfg" (UniqueName: "kubernetes.io/configmap/dde7c5af-c893-11e8-a0aa-001018759bc8-flannel-cfg") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520855 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-bin-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-bin-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520952 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "canal-token-nsdwz" (UniqueName: "kubernetes.io/secret/dde7c5af-c893-11e8-a0aa-001018759bc8-canal-token-nsdwz") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .521094 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "xtables-lock" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-xtables-lock") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .521160 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy-token-zjtdh" (UniqueName: "kubernetes.io/secret/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy-token-zjtdh") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .521232 reconciler.go:154] Reconciler: start to sync state
E1005 .537905 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
E1005 .574965 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
E1005 .613275 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
E1005 .656607 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
I found the solution myself: a configuration file in /etc/systemd/system/kubelet.service.d used the wrong startup parameters. I changed them and that solved my problem.
The file 20-etcd-service-manager.conf contained the value
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true
which caused my problem. I changed it to
ExecStart=/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni
because these are the parameters used on my other nodes.
It might be better to delete the file so that it doesn't override any other settings.
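To locate a drop-in like the one described above, a small shell helper can scan the drop-in directory for files that launch the kubelet without any `--kubeconfig` flag. This is only a sketch under assumptions: the function name `find_standalone_dropins` is made up for illustration, and the check is a plain text match, so it may miss setups that pass the flag in some other way.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubeadm or systemd): print every drop-in
# file that starts kubelet via ExecStart but never mentions --kubeconfig.
# The default directory is the one named in the answer; pass another to override.
find_standalone_dropins() {
    dir="${1:-/etc/systemd/system/kubelet.service.d}"
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        # A bare "ExecStart=" only resets the inherited command, so require
        # an ExecStart line that actually names the kubelet binary.
        if grep -Eq '^ExecStart=.*kubelet' "$f" && ! grep -q -- '--kubeconfig' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage: find_standalone_dropins
#        find_standalone_dropins /path/to/other/dropin/dir
```

Any file it prints is a candidate for editing or removal. After changing a drop-in, `systemctl daemon-reload` followed by `systemctl restart kubelet` is needed for systemd to pick up the change.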
Thank you very much for adding the solution! Here is what I did in my case:
- Uninstall and purge kubelet, kubeadm, and kubectl.
- Remove /etc/systemd/system/kubelet.service.d
- Reinstall and retry.
On Ubuntu:
apt-get remove --purge kubelet kubeadm kubectl
rm -rf /etc/systemd/system/kubelet.service.d
apt-get install kubelet kubeadm kubectl
kubeadm join ...
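After retrying the join, one quick sanity check is to look for the warning that appears in the question's kubelet log, `standalone mode, no API client`. In the question it showed up while the kubelet was started without a kubeconfig, so its absence is a reasonable (though not conclusive) sign that the flags are now being picked up. The helper name `kubelet_is_standalone` below is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read kubelet log text on stdin and succeed (exit 0)
# if it contains the standalone-mode warning seen in the question's log.
kubelet_is_standalone() {
    grep -q 'standalone mode, no API client'
}

# Usage on a systemd host:
#   journalctl -u kubelet --no-pager | kubelet_is_standalone && echo "kubelet is still in standalone mode"
```

Reading from stdin keeps the check independent of where the log comes from, so it works equally well on a saved log file.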
当尝试使用外部 etcd 在 Kubernetes 1.12 中设置 HA 集群时,我在使用以下命令时遇到超时:
kubeadm join <load balancer>:443 --token <token> --discovery-token-ca-cert-hash sha256:3dfa042fcc28a26da9335c14802718bbc36b82bb71b4e5dfaa70c004454932da --experimental-control-plane
输出:
[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "<load balancer>:443"
[discovery] Created cluster-info discovery client, requesting info from "https://<load balancer>:443"
[discovery] Requesting info from "https://<load balancer>:443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "<load balancer>:443"
[discovery] Successfully established connection with API Server "<load balancer>:443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I1005 12:48:29.896403 8131 join.go:334] [join] running pre-flight checks before initializing the new control plane instance
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[certificates] Using the existing apiserver certificate and key.
[certificates] Using the existing apiserver-kubelet-client certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Using the existing sa key.
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
timed out waiting for the condition
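Two master nodes were installed successfully before this error appeared. I used this as the installation guide: https://kubernetes.io/docs/setup/independent/high-availability/#set-up-the-cluster
My load balancer runs on the same node where I am trying to install the cluster, but I don't see why that would be a problem (maybe it is?).
The kubelet logs don't show anything significant: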
kubelet[26132]: I1005 09:34:32.667360 26132 server.go:408] Version: v1.12.0
kubelet[26132]: I1005 09:34:32.667520 26132 plugins.go:99] No cloud provider specified.
kubelet[26132]: W1005 09:34:32.667553 26132 server.go:553] standalone mode, no API client
kubelet[26132]: W1005 09:34:32.745120 26132 server.go:465] No api server defined - no events will be sent to API server.
kubelet[26132]: I1005 09:34:32.745178 26132 server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
kubelet[26132]: I1005 09:34:32.745944 26132 container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
kubelet[26132]: I1005 09:34:32.745974 26132 container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: En
kubelet[26132]: I1005 09:34:32.746237 26132 container_manager_linux.go:271] Creating device plugin manager: true
kubelet[26132]: I1005 09:34:32.746368 26132 state_mem.go:36] [cpumanager] initializing new in-memory state store
kubelet[26132]: I1005 09:34:32.747800 26132 kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
kubelet[26132]: I1005 09:34:32.752107 26132 client.go:75] Connecting to docker on unix:///var/run/docker.sock
kubelet[26132]: I1005 09:34:32.752172 26132 client.go:104] Start docker client with request timeout=2m0s
kubelet[26132]: W1005 09:34:32.754889 26132 docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
kubelet[26132]: I1005 09:34:32.754954 26132 docker_service.go:236] Hairpin mode set to "hairpin-veth"
kubelet[26132]: W1005 09:34:32.755195 26132 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet[26132]: W1005 09:34:32.759325 26132 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
kubelet[26132]: I1005 09:34:32.762094 26132 docker_service.go:251] Docker cri networking managed by kubernetes.io/no-op
kubelet[26132]: I1005 09:34:32.789329 26132 docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan nul
kubelet[26132]: I1005 09:34:32.789503 26132 docker_service.go:269] Setting cgroupDriver to cgroupfs
kubelet[26132]: I1005 09:34:32.820067 26132 kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
kubelet[26132]: I1005 09:34:32.822547 26132 server.go:1013] Started kubelet
kubelet[26132]: W1005 09:34:32.822599 26132 kubelet.go:1387] No api server defined - no node status update will be sent.
kubelet[26132]: E1005 09:34:32.822622 26132 kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
kubelet[26132]: I1005 09:34:32.822624 26132 server.go:133] Starting to listen on 127.0.0.1:10250
kubelet[26132]: I1005 09:34:32.823855 26132 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
kubelet[26132]: I1005 09:34:32.823900 26132 status_manager.go:148] Kubernetes client is nil, not starting status manager.
kubelet[26132]: I1005 09:34:32.823919 26132 kubelet.go:1804] Starting kubelet main sync loop.
kubelet[26132]: I1005 09:34:32.823971 26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
kubelet[26132]: I1005 09:34:32.824016 26132 volume_manager.go:248] Starting Kubelet Volume Manager
kubelet[26132]: I1005 09:34:32.824094 26132 desired_state_of_world_populator.go:130] Desired state populator starts to run
kubelet[26132]: I1005 09:34:32.824656 26132 server.go:318] Adding debug handlers to kubelet server.
kubelet[26132]: I1005 09:34:32.924253 26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down]
kubelet[26132]: I1005 09:34:33.072557 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.077937 26132 cpu_manager.go:155] [cpumanager] starting with none policy
kubelet[26132]: I1005 09:34:33.077967 26132 cpu_manager.go:156] [cpumanager] reconciling every 10s
kubelet[26132]: I1005 09:34:33.077976 26132 policy_none.go:42] [cpumanager] none policy: Start
kubelet[26132]: W1005 09:34:33.078616 26132 manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
kubelet[26132]: I1005 09:34:33.078989 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.124726 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.130955 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.136320 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.136580 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.142780 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.143667 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.224945 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-ca-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.225058 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etcd-certs-0" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etcd-certs-0") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.225200 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etc-pki") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.325745 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flexvolume-dir" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-flexvolume-dir") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.325834 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-etc-pki") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.325890 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-kubeconfig") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.326047 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/dd3b0cd7d636afb2b116453dc6524f26-kubeconfig") pod "kube-scheduler-" (UID: "dd3b0cd7d636afb2b116453dc6524f26")
kubelet[26132]: I1005 09:34:33.326393 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-k8s-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
kubelet[26132]: I1005 09:34:33.326524 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-k8s-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.326645 26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-ca-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
kubelet[26132]: I1005 09:34:33.326693 26132 reconciler.go:154] Reconciler: start to sync state
dockerd[24966]: time="2018-10-05T09:34:33.789690025+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 40806fa9041d3a65d39fdc1a68e2415f0d77f84e0c4f8c163d3bd48fec0d763f"
kubelet[26132]: W1005 09:34:33.792727 26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/92f250670b6bc27fc8b90703d1196aa3/kube-controller-manager/0.log"
dockerd[24966]: time="2018-10-05T09:34:33.820145872+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 19328df83a640d71faf86310d1a4052f3af42e75513d9745a2775532803ba122"
kubelet[26132]: W1005 09:34:33.822612 26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/dd3b0cd7d636afb2b116453dc6524f26/kube-scheduler/0.log"
dockerd[24966]: time="2018-10-05T09:34:33.836511632+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 6b9e3036a5027b42a4340ad0779be6030593d1a10df4367c0a0ca54ff1345f16"
kubelet[26132]: I1005 09:34:33.851661 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.865408 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:33.874766 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: W1005 09:34:34.841803 26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fc349d-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
kubelet[26132]: W1005 09:34:34.841888 26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/7c7d1db45cb11bf12de2eac803da8b77/volumes" does not exist
kubelet[26132]: W1005 09:34:34.841935 26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fbcf1b-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
kubelet[26132]: I1005 09:34:34.880168 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:34.880564 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:34.880645 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:43.121992 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:34:53.165661 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
sshd[26621]: Connection closed by 172.29.2.56 port 50080 [preauth]
kubelet[26132]: I1005 09:35:03.210021 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:35:13.252179 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
kubelet[26132]: I1005 09:35:23.295605 26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
Any ideas?
Edit:
Comparing the kubelet across the nodes, I found that on the other two nodes it is started like this:
kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni
After the TLS timeout, I ran this command on the third node, which resulted in:
I1005 .008343 server.go:408] Version: v1.12.0
I1005 .008857 plugins.go:99] No cloud provider specified.
I1005 .045644 certificate_store.go:131] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
I1005 .134861 server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
I1005 .135501 container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
I1005 .135551 container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms}
I1005 .135777 container_manager_linux.go:271] Creating device plugin manager: true
I1005 .135829 state_mem.go:36] [cpumanager] initializing new in-memory state store
I1005 .136055 state_mem.go:84] [cpumanager] updated default cpuset: ""
I1005 .136084 state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
I1005 .136410 kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
I1005 .136461 kubelet.go:304] Watching apiserver
I1005 .141009 client.go:75] Connecting to docker on unix:///var/run/docker.sock
I1005 .141054 client.go:104] Start docker client with request timeout=2m0s
W1005 .143351 docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I1005 .143395 docker_service.go:236] Hairpin mode set to "hairpin-veth"
W1005 .143618 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
W1005 .147722 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
W1005 .147880 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
I1005 .147944 docker_service.go:251] Docker cri networking managed by cni
I1005 .177322 docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:18 OomKillDisable:true NGoroutines:27 SystemTime:2018-10-05T .158551524+02:00 LoggingDriver:json-file CgroupDriver:cgroupfs NEventsListener:0 KernelVersion:4.18.5-1.el7.elrepo.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc4201e65b0 NCPU:40 MemTotal:134664974336 GenericResources:[] DockerRootDir:/export/data/docker HTTPProxy: HTTPSProxy: NoProxy: Name:dax Labels:[] ExperimentalBuild:false ServerVersion:17.06.2-ce ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:docker-runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil>} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:6e23458c129b551d5c9871e5174f6b1b7f6d1170 Expected:6e23458c129b551d5c9871e5174f6b1b7f6d1170} RuncCommit:{ID:810190ceaa507aa2727d7ae6f4790c76ec150bd2 Expected:810190ceaa507aa2727d7ae6f4790c76ec150bd2} InitCommit:{ID:949e6fa Expected:949e6fa} SecurityOptions:[name=seccomp,profile=default]}
I1005 .177565 docker_service.go:269] Setting cgroupDriver to cgroupfs
I1005 .211074 kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
I1005 .213560 server.go:1013] Started kubelet
E1005 .213611 kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
I1005 .213712 server.go:133] Starting to listen on 0.0.0.0:10250
I1005 .216143 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
I1005 .216334 status_manager.go:152] Starting to sync pod status with apiserver
I1005 .216447 kubelet.go:1804] Starting kubelet main sync loop.
I1005 .216962 kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
I1005 .218285 volume_manager.go:248] Starting Kubelet Volume Manager
I1005 .218904 desired_state_of_world_populator.go:130] Desired state populator starts to run
I1005 .220387 server.go:318] Adding debug handlers to kubelet server.
W1005 .221605 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
E1005 .221954 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
E1005 .317227 kubelet.go:2236] node "dax" not found
I1005 .317229 kubelet.go:1821] skipping pod synchronization - [container runtime is down]
I1005 .318558 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
I1005 .323926 kubelet_node_status.go:70] Attempting to register node dax
I1005 .332022 kubelet_node_status.go:73] Successfully registered node dax
I1005 .417546 kuberuntime_manager.go:910] updating runtime config through cri with podcidr 10.244.3.0/24
I1005 .418060 docker_service.go:345] docker cri received runtime config &RuntimeConfig{NetworkConfig:&NetworkConfig{PodCidr:10.244.3.0/24,},}
I1005 .418505 kubelet_network.go:75] Setting Pod CIDR: -> 10.244.3.0/24
I1005 .465985 cpu_manager.go:155] [cpumanager] starting with none policy
I1005 .466004 cpu_manager.go:156] [cpumanager] reconciling every 10s
I1005 .466012 policy_none.go:42] [cpumanager] none policy: Start
W1005 .466606 manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
W1005 .467018 container_manager_linux.go:803] CPUAccounting not enabled for pid:
W1005 .467029 container_manager_linux.go:806] MemoryAccounting not enabled for pid:
W1005 .467770 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
E1005 .467952 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
I1005 .520111 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-lib-modules") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520186 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-run-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-run-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520296 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "run" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-run") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520485 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-net-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-net-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520581 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy" (UniqueName: "kubernetes.io/configmap/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .520641 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-lib-modules") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .520697 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-lib-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-lib-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520755 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flannel-cfg" (UniqueName: "kubernetes.io/configmap/dde7c5af-c893-11e8-a0aa-001018759bc8-flannel-cfg") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520855 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-bin-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-bin-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .520952 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "canal-token-nsdwz" (UniqueName: "kubernetes.io/secret/dde7c5af-c893-11e8-a0aa-001018759bc8-canal-token-nsdwz") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
I1005 .521094 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "xtables-lock" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-xtables-lock") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .521160 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy-token-zjtdh" (UniqueName: "kubernetes.io/secret/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy-token-zjtdh") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
I1005 .521232 reconciler.go:154] Reconciler: start to sync state
E1005 .537905 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
E1005 .574965 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
E1005 .613275 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
E1005 .656607 summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
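I found the solution myself: a config file in /etc/systemd/system/kubelet.service.d used the wrong startup parameters. I changed them and that solved my problem.
The file 20-etcd-service-manager.conf contained the line
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true
which caused my problem. I changed it to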
ExecStart=/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni
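because those are the parameters used on my other nodes. Deleting the file would probably be better, so that it does not override any other settings.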
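To check which drop-in ends up deciding how the kubelet starts, one option is to look for the last non-empty ExecStart= line across the drop-in files. This is only a rough sketch: it assumes all drop-ins live in one directory and that the shell's glob order matches the file-name order systemd applies, and the function name effective_kubelet_execstart is made up for this example. `systemctl cat kubelet` remains the authoritative view of what systemd actually loaded.

```shell
# Print the last non-empty ExecStart= line found in the kubelet drop-in
# directory. Hypothetical helper: the name and the default path argument
# are illustrative, not part of kubeadm or systemd.
effective_kubelet_execstart() {
    dir="${1:-/etc/systemd/system/kubelet.service.d}"
    # Concatenate the drop-ins in glob order, skip the empty "ExecStart="
    # reset lines, and keep only the last remaining ExecStart line.
    cat "$dir"/*.conf 2>/dev/null | grep '^ExecStart=.' | tail -n 1
}
```

Called with no argument, it reads /etc/systemd/system/kubelet.service.d. If the printed line neither carries the kubeconfig flags nor references variables that supply them (as with the --address=127.0.0.1 --pod-manifest-path=... line here), that is consistent with the "standalone mode, no API client" warning in the first log. After editing or removing a drop-in, `systemctl daemon-reload` followed by `systemctl restart kubelet` is needed before the change takes effect.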
Thank you very much for adding the solution! Here is what I did in my case:
- Uninstall and purge kubelet, kubeadm, and kubectl.
- Clear out /etc/systemd/system/kubelet.service.d
- Reinstall and try again.
On Ubuntu:
apt-get remove --purge kubelet kubeadm kubectl
rm -rf /etc/systemd/system/kubelet.service.d
apt-get install kubelet kubeadm kubectl
kubeadm join ...