Kubernetes Kubelet doesn't have access to Docker

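I have a 5-node Kubernetes cluster, one node of which is the master (set up with kubeadm). When I first deployed the master, I also deployed the Kubernetes dashboard, so it runs on that same machine. After that I joined the other nodes to the cluster.

Now, when I deploy a pod from a YAML file, it stays in the ContainerCreating state. So I described the pod to see which machine it was scheduled on. I SSHed into that machine and first checked docker ps -a, which confirmed that the container had not been started and the image had not even been pulled. So I looked at the kubelet logs (I didn't copy everything, but this should give a good idea):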

E0131 11:05:40.486422    2873 server.go:459] Kubelet needs to run as uid `0`. It is being run as 1000
W0131 11:05:40.486616    2873 server.go:469] write /proc/self/oom_score_adj: permission denied
W0131 11:05:40.486978    2873 server.go:669] No api server defined - no events will be sent to API server.
W0131 11:05:40.491423    2873 kubelet_network.go:69] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0131 11:05:40.491498    2873 kubelet.go:477] Hairpin mode set to "hairpin-veth"
W0131 11:05:40.495353    2873 plugins.go:210] can't set sysctl net/bridge/bridge-nf-call-iptables: open /proc/sys/net/bridge/bridge-nf-call-iptables: permission denied
I0131 11:05:40.503259    2873 docker_manager.go:257] Setting dockerRoot to /var/lib/docker
I0131 11:05:40.503308    2873 docker_manager.go:260] Setting cgroupDriver to cgroupfs
I0131 11:05:40.506028    2873 server.go:770] Started kubelet v1.5.2
E0131 11:05:40.506209    2873 server.go:481] Starting health server failed: listen tcp 127.0.0.1:10248: bind: address already in use
E0131 11:05:40.506300    2873 kubelet.go:1145] Image garbage collection failed: unable to find data for container /
I0131 11:05:40.506413    2873 server.go:123] Starting to listen on 0.0.0.0:10250
W0131 11:05:40.506445    2873 kubelet.go:1224] No api server defined - no node status update will be sent.
E0131 11:05:40.507209    2873 kubelet.go:1228] error creating pods directory: mkdir /var/lib/kubelet/pods: permission denied
I0131 11:05:40.509613    2873 status_manager.go:125] Kubernetes client is nil, not starting status manager.
I0131 11:05:40.509656    2873 kubelet.go:1714] Starting kubelet main sync loop.
I0131 11:05:40.509710    2873 kubelet.go:1725] skipping pod synchronization - [error creating pods directory: mkdir /var/lib/kubelet/pods: permission denied container runtime is down]
F0131 11:05:40.509522    2873 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use

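There are a lot of permission problems, and I don't know how to fix them. I added both root and my user account to the docker group to see whether that would fix it, but it didn't.

Update

Above I ran the command kubelet logs, which is why you see the uid message. When I run sudo kubelet logs, I get these results: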

I0201 15:36:01.386564    5082 feature_gate.go:181] feature gates: map[]
W0201 15:36:01.386861    5082 server.go:400] No API client: no api servers specified
I0201 15:36:01.386953    5082 docker.go:356] Connecting to docker on unix:///var/run/docker.sock
I0201 15:36:01.386991    5082 docker.go:376] Start docker client with request timeout=2m0s
I0201 15:36:01.401737    5082 manager.go:143] cAdvisor running in container: "/user.slice"
W0201 15:36:01.415664    5082 manager.go:151] unable to connect to Rkt api service: rkt: cannot tcp Dial rkt api service: dial tcp [::1]:15441: getsockopt: connection refused
I0201 15:36:01.431725    5082 fs.go:117] Filesystem partitions: map[/dev/mmcblk0p2:{mountpoint:/var/lib/docker/aufs major:179 minor:2 fsType:ext4 blockSize:0}]
I0201 15:36:01.440439    5082 manager.go:198] Machine: {NumCores:4 CpuFrequency:1920000 MemoryCapacity:3519315968 MachineID:a9807123b38d1f069a44f0b7588b5884 SystemUUID:03000200-0400-0500-0006-000700080009 BootID:7e71fe9b-a9d8-4921-80c7-9d09e49ed1ef Filesystems:[{Device:/dev/mmcblk0p2 Capacity:57295605760 Type:vfs Inodes:3563520 HasInodes:true}] DiskMap:map[179:0:{Name:mmcblk0 Major:179 Minor:0 Size:62545461248 Scheduler:deadline} 179:8:{Name:mmcblk0boot0 Major:179 Minor:8 Size:4194304 Scheduler:deadline} 179:16:{Name:mmcblk0boot1 Major:179 Minor:16 Size:4194304 Scheduler:deadline} 179:24:{Name:mmcblk0rpmb Major:179 Minor:24 Size:4194304 Scheduler:deadline}] NetworkDevices:[{Name:datapath MacAddress:72:36:99:b2:ba:be Speed:0 Mtu:1410} {Name:dummy0 MacAddress:ea:c7:5e:6d:29:75 Speed:0 Mtu:1500} {Name:enp1s0 MacAddress:00:07:32:3e:17:8c Speed:1000 Mtu:1500} {Name:vxlan-6784 MacAddress:5a:81:bb:f6:00:d7 Speed:0 Mtu:1500} {Name:weave MacAddress:92:64:f5:c5:57:a7 Speed:0 Mtu:1410}] Topology:[{Id:0 Memory:3519315968 Cores:[{Id:0 Threads:[0] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:1 Threads:[1] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:2 Threads:[2] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]} {Id:3 Threads:[3] Caches:[{Size:24576 Type:Data Level:1} {Size:32768 Type:Instruction Level:1}]}] Caches:[]}] CloudProvider:Unknown InstanceType:Unknown InstanceID:None}
I0201 15:36:01.442170    5082 manager.go:204] Version: {KernelVersion:4.4.0-31-generic ContainerOsVersion:Ubuntu 16.04.1 LTS DockerVersion:1.12.3 CadvisorVersion: CadvisorRevision:}
I0201 15:36:01.444559    5082 cadvisor_linux.go:152] Failed to register cAdvisor on port 4194, retrying. Error: listen tcp :4194: bind: address already in use
W0201 15:36:01.449146    5082 container_manager_linux.go:205] Running with swap on is not supported, please disable swap! This will be a fatal error by default starting in K8s v1.6! In the meantime, you can opt-in to making this a fatal error by enabling --experimental-fail-swap-on.
W0201 15:36:01.449653    5082 server.go:669] No api server defined - no events will be sent to API server.
W0201 15:36:01.457574    5082 kubelet_network.go:69] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
I0201 15:36:01.457658    5082 kubelet.go:477] Hairpin mode set to "hairpin-veth"
I0201 15:36:01.471512    5082 docker_manager.go:257] Setting dockerRoot to /var/lib/docker
I0201 15:36:01.471570    5082 docker_manager.go:260] Setting cgroupDriver to cgroupfs
I0201 15:36:01.474678    5082 server.go:770] Started kubelet v1.5.2
E0201 15:36:01.474926    5082 server.go:481] Starting health server failed: listen tcp 127.0.0.1:10248: bind: address already in use
E0201 15:36:01.475062    5082 kubelet.go:1145] Image garbage collection failed: unable to find data for container /
W0201 15:36:01.475208    5082 kubelet.go:1224] No api server defined - no node status update will be sent.
I0201 15:36:01.475702    5082 kubelet_node_status.go:204] Setting node annotation to enable volume controller attach/detach
I0201 15:36:01.479587    5082 server.go:123] Starting to listen on 0.0.0.0:10250
F0201 15:36:01.481605    5082 server.go:148] listen tcp 0.0.0.0:10255: bind: address already in use

You need to run kubelet as root (see the first line of your log). This is currently a known limitation:

https://github.com/kubernetes/kubernetes/issues/4869

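The kubelet tool has no logs subcommand, so when you run kubelet logs you are actually starting the kubelet process again, without any valid arguments. The missing arguments are the source of most of those messages, and the process finally stops with the message bind: address already in use because something, presumably your existing kubelet process (the one running as root), is already bound to that port.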
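
To see which ports that second kubelet instance tripped over, the bind failures can be pulled out of its output. This is only a rough sketch: the function name bind_conflict_ports and the file name kubelet.out are made up for illustration, and it assumes the messages look like the ones quoted in the question.

```shell
# bind_conflict_ports (hypothetical helper): read kubelet output on stdin and
# print each port that failed with "bind: address already in use", one per
# line, sorted numerically with duplicates removed.
bind_conflict_ports() {
    sed -nE 's/.*:([0-9]+): bind: address already in use.*/\1/p' | sort -un
}

# Example usage, assuming the output was saved to kubelet.out (made-up name):
#   bind_conflict_ports < kubelet.out
# Then check what is already listening on a reported port, e.g. 10255:
#   sudo ss -ltnp | grep ':10255 '
```

If the process holding those ports turns out to be a kubelet running as root, that is most likely the real one, and its logs are the ones worth reading.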

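How you view kubelet's logs depends on how the kubelet process was set up. With my setup (kops), for example, you can use journalctl -u kubelet; with other setups you can look for /var/log/<kubelet-log-file>.log or something similar.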
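
For the journalctl case, here is a small sketch for narrowing the output down to real problems. Kubelet log lines start with a severity letter (I, W, E or F), so errors and fatals can be filtered on the first character. The name kubelet_problems is made up, and the usage line assumes kubelet runs as a systemd unit called kubelet, which may not match your setup.

```shell
# kubelet_problems (hypothetical helper): keep only error (E) and fatal (F)
# entries from kubelet log lines read on stdin.
kubelet_problems() {
    # Lines look like "E0131 11:05:40.486422 ..."; "|| true" keeps the exit
    # status at 0 when nothing matches.
    grep -E '^[EF][0-9]{4} ' || true
}

# Example usage on a systemd host (-o cat prints only the message text, so the
# severity letter is the first character of each line):
#   sudo journalctl -u kubelet --no-pager -o cat | kubelet_problems
```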