Kubernetes - kube-system pods in master node keep restarting after worker node joins
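I have followed this tutorial, this tutorial, and this one, but I have been running into the same problem for the past 3 days.
I can set up the master node correctly with the following steps: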
kubeadm init
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export kubever=$(kubectl version | base64 | tr -d '\n')
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"
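The kubever line base64-encodes the kubectl version output so that it can be passed as a URL query parameter. Some base64 implementations wrap long output over several lines, which is presumably why the newlines are stripped. A minimal sketch of that step in isolation (encode_for_query is a hypothetical helper name, not part of any tool):

```shell
# encode_for_query: base64-encode stdin and join the result into one line.
# Hypothetical helper; it only wraps the two commands used in the step above.
encode_for_query() {
    base64 | tr -d '\n'
}

# 100 input bytes encode to 136 base64 characters; without the tr step,
# implementations that wrap at 76 columns would split this over two lines.
printf '%0100d' 0 | encode_for_query | wc -c
```

The single-line result is what ends up in the k8s-version query parameter.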
Everything seems fine:
kubectl get all --namespace=kube-system
Then, on the worker node:
kubeadm join --token 864655.fdf6d0b389867b79 192.168.100.17:6443 --discovery-token-ca-cert-hash sha256:a2d840808b17b53b9612e6271ccde489f13dbede7d354f97188d0faa9e210af2
The output looks fine, as shown below:
[preflight] Running pre-flight checks.
[WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "192.168.100.17:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.100.17:6443"
[discovery] Requesting info from "https://192.168.100.17:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.100.17:6443"
[discovery] Successfully established connection with API Server "192.168.100.17:6443"
This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
But as soon as I run this command, everything falls apart:
kubectl get all --namespace=kube-system
It starts showing all the pods restarting. The status keeps flipping between Pending and Running; sometimes some pods even disappear, and others show states such as ContainerCreating.
NAME READY STATUS RESTARTS AGE
po/etcd-ubuntu 0/1 Pending 0 0s
po/kube-controller-manager-ubuntu 0/1 Pending 0 0s
po/kube-dns-6f4fd4bdf-cmcfk 3/3 Running 0 13m
po/kube-proxy-2chb6 1/1 Running 0 13m
po/kube-scheduler-ubuntu 0/1 Pending 0 0s
po/weave-net-ptdxr 2/2 Running 0 11m
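I also tried the second tutorial, with flannel, and got exactly the same problem.
My setup
I created two new virtual machines on VMware with fresh installs of Ubuntu 17.10, each with 2 processors/2 cores, 6 GB of RAM, and a 50 GB disk. My physical machine is an i7-6700k with 32 GB of RAM.
I installed kubeadm, kubelet, and docker on both of them and then followed the steps mentioned above.
I have also tried switching between NAT and Bridge on VMware, but nothing changed.
With bridged networking, the two VMs initially have the IPs 192.168.100.12 and 192.168.100.17.
hostname -I for the master: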
192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2
hostname -I for the worker node:
192.168.100.12 172.17.0.1 10.44.0.0 10.32.0.1
journalctl -xeu kubelet shows the following:
https://gist.github.com/saad749/9a771a3460bf88c274498b5bc4b7fd84
When trying with flannel (still the same problem), the result of kubectl describe nodes is:
https://gist.github.com/saad749/d24c453c8b4e663e9abf572a0fb38bf4
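Am I missing any step before kubeadm init? Should I change the IP addresses (to what)? Is there a specific log I should be looking at? Is there a more comprehensive tutorial?
All of the problems start after kubeadm join on the worker node; I can deploy Kubernetes on the master node, or anything else, and it works fine.
Update:
Even after applying errordeveloper's suggestions, the same problem persists.
I added the following flag to kubeadm init:
--apiserver-advertise-address 192.168.100.17
I updated kubeadm.conf to the following, then reloaded and restarted: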
https://gist.github.com/saad749/c7149c87ec3e75a40586f626cf04279a
and also tried changing the cluster DNS:
https://gist.github.com/saad749/5fa66bebc22841e58119333e75600e40
Logs after initializing the master:
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system etcd-ubuntu 1/1 Running 0 22s 192.168.100.17 ubuntu
kube-system kube-apiserver-ubuntu 1/1 Running 0 29s 192.168.100.17 ubuntu
kube-system kube-controller-manager-ubuntu 1/1 Running 0 13s 192.168.100.17 ubuntu
kube-system kube-dns-6f4fd4bdf-wfqhb 3/3 Running 0 1m 10.32.0.7 ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 1m 192.168.100.17 ubuntu
kube-system kube-scheduler-ubuntu 1/1 Running 0 34s 192.168.100.17 ubuntu
kube-system weave-net-fkgnh 2/2 Running 0 32s 192.168.100.17 ubuntu
hostname -i results:
kube-master@ubuntu:~$ hostname -I
192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2 10.32.0.3 10.32.0.4 10.32.0.5 10.32.0.6 10.244.0.0 10.244.0.1
kube-master@ubuntu:~$ hostname -i
192.168.100.17
Results from kubectl describe nodes:
https://gist.github.com/saad749/8f460650182a04d0ddf3158a52761a9a
The InternalIP now appears to be correct.
After the second node joins, this happens:
kube-master@ubuntu:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ubuntu Ready master 49m v1.9.3
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system kube-controller-manager-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system kube-dns-6f4fd4bdf-wfqhb 0/3 ContainerCreating 0 49m <none> ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 49m 192.168.100.17 ubuntu
kube-system kube-scheduler-ubuntu 1/1 Running 0 1s 192.168.100.17 ubuntu
kube-system weave-net-fkgnh 2/2 Running 0 48m 192.168.100.17 ubuntu
ifconfig -a results:
https://gist.github.com/saad749/63a5a52bd3246ff72477b2aca7d158d0
journalctl -xeu kubelet results:
https://gist.github.com/saad749/8a60870b35f93df8565e66cb208aff32
Sometimes the pod IP shows up as 192.168.100.12, which is the IP of the second (non-master) node:
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system etcd-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system kube-apiserver-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system kube-controller-manager-ubuntu 1/1 Running 0 0s 192.168.100.12 ubuntu
kube-system kube-dns-6f4fd4bdf-wfqhb 2/3 Running 0 3h 10.32.0.7 ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 3h 192.168.100.12 ubuntu
kube-system kube-scheduler-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system weave-net-fkgnh 2/2 Running 1 3h 192.168.100.17 ubuntu
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system kube-dns-6f4fd4bdf-wfqhb 3/3 Running 0 3h 10.32.0.7 ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 3h 192.168.100.12 ubuntu
kube-system weave-net-fkgnh 2/2 Running 0 3h 192.168.100.12 ubuntu
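To pick the affected rows out of output like the above, here is a minimal sketch. The helper name pods_with_ip is hypothetical, and it assumes the -o wide column layout shown here, with the IP in the seventh column:

```shell
# pods_with_ip IP: read `kubectl get pod --all-namespaces -o wide` output on
# stdin and print the names of pods whose IP column equals IP.
# Assumption: the header row comes first and IP is the 7th field.
pods_with_ip() {
    awk -v ip="$1" 'NR > 1 && $7 == ip { print $2 }'
}

# Rows copied from the output above; prints the pods reported with the
# worker's address even though the NODE column says "ubuntu".
pods_with_ip 192.168.100.12 <<'EOF'
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system kube-controller-manager-ubuntu 1/1 Running 0 0s 192.168.100.12 ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 3h 192.168.100.12 ubuntu
kube-system weave-net-fkgnh 2/2 Running 1 3h 192.168.100.17 ubuntu
EOF
```

On a live cluster the same function could be fed directly from kubectl get pod --all-namespaces -o wide.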
kubectl describe nodes
Name: ubuntu
Roles: master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=ubuntu
node-role.kubernetes.io/master=
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: node-role.kubernetes.io/master:NoSchedule
CreationTimestamp: Fri, 02 Mar 2018 08:21:47 -0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 11:28:25 -0800 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 192.168.100.12
Hostname: ubuntu
Capacity:
cpu: 4
memory: 6080832Ki
pods: 110
Allocatable:
cpu: 4
memory: 5978432Ki
pods: 110
System Info:
Machine ID: 59bf65b835b242a3aa182f4b8a542219
System UUID: 0C3C4D56-4747-D59E-EE09-F16F2793677E
Boot ID: 658b4a08-d724-425e-9246-2b41995ecc46
Kernel Version: 4.13.0-36-generic
OS Image: Ubuntu 17.10
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.13.1
Kubelet Version: v1.9.3
Kube-Proxy Version: v1.9.3
ExternalID: ubuntu
Non-terminated Pods: (3 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system kube-dns-6f4fd4bdf-wfqhb 260m (6%) 0 (0%) 110Mi (1%) 170Mi (2%)
kube-system kube-proxy-h4hz9 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system weave-net-fkgnh 20m (0%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
280m (7%) 0 (0%) 110Mi (1%) 170Mi (2%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Rebooted 12m (x814 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
Normal NodeHasNoDiskPressure 10m kubelet, ubuntu Node ubuntu status is now: NodeHasNoDiskPressure
Normal Starting 10m kubelet, ubuntu Starting kubelet.
Normal NodeAllocatableEnforced 10m kubelet, ubuntu Updated Node Allocatable limit across pods
Normal NodeHasSufficientDisk 10m kubelet, ubuntu Node ubuntu status is now: NodeHasSufficientDisk
Normal NodeHasSufficientMemory 10m kubelet, ubuntu Node ubuntu status is now: NodeHasSufficientMemory
Normal NodeNotReady 10m kubelet, ubuntu Node ubuntu status is now: NodeNotReady
Warning Rebooted 2m (x870 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 658b4a08-d724-425e-9246-2b41995ecc46
Warning Rebooted 15s (x60 over 10m) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
What am I doing wrong?
Should I change the IP addresses (to what)?
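Yes, that is usually what you need to do when running on VMs where the default route is used for NATed access to the Internet.
You want to use the IP of the bridged network, which for your master appears to be 192.168.100.17 (but please double-check).
To begin with, try kubeadm init --apiserver-advertise-address 192.168.100.17, but that may not solve all of the problems.
In the output of your kubectl describe nodes, I can see this: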
Addresses:
InternalIP: 172.17.0.1
Hostname: ubuntu
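So you probably want to make sure that the kubelet doesn't use the NATed interface either; for that you will need to use the kubelet's --node-ip flag.
However, there are other ways of fixing this, e.g. if you can make sure that hostname -i returns the IP of the bridged interface (which you can do by tweaking /etc/hosts).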
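As a rough sketch of choosing the address to hand to --node-ip, the function below picks the first address with a given prefix out of a hostname -I style list. The name pick_node_ip and the choice of 192.168.100. as the bridged prefix are assumptions based on the addresses in the question:

```shell
# pick_node_ip PREFIX: read a space-separated address list (as printed by
# `hostname -I`) on stdin and print the first address starting with PREFIX.
# Hypothetical helper; the prefix must be chosen to match the bridged network.
pick_node_ip() {
    tr ' ' '\n' | awk -v p="$1" 'index($0, p) == 1 { print; exit }'
}

# Address list copied from the master's `hostname -I` output in the question.
printf '%s\n' '192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2' | pick_node_ip '192.168.100.'
```

The selected address would then be passed to the kubelet as --node-ip=<address>; where that flag is set depends on how the kubelet service is configured on the machine.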
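So after following @errordeveloper's suggestions and still hitting a wall, I was able to solve the problem, and it turned out to be very simple.
Both of my virtual machines had the same hostname.
hostname -f would return ubuntu on both, which apparently causes problems in Kubernetes.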
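A quick way to check for this before joining a node, as a minimal sketch: collect the hostname -f output from every machine (how they are gathered, for example over SSH, is left out here) and look for repeats. The function name dup_hostnames is hypothetical:

```shell
# dup_hostnames: read one hostname per line on stdin and print each name
# that occurs more than once. No output means every name is unique.
dup_hostnames() {
    sort | uniq -d
}

# The two VMs from this thread both reported "ubuntu".
printf '%s\n' ubuntu ubuntu | dup_hostnames
```

Any name printed here is shared by at least two machines.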
I changed the name on the non-master node with
hostnamectl set-hostname kminion
and in the following files:
/etc/hostname
/etc/hosts
All is well!
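For anyone checking for the same symptom: in the kubectl describe nodes output from the question, the Rebooted warnings alternate between two boot IDs. That would be consistent with two machines reporting under one node name, though this is only a reading of the output and is not confirmed anywhere in the thread. A small sketch for listing the distinct boot IDs (distinct_boot_ids is a hypothetical helper name):

```shell
# distinct_boot_ids: read `kubectl describe nodes` output on stdin and print
# each distinct boot ID mentioned in "Rebooted" event messages.
distinct_boot_ids() {
    sed -n 's/.*boot id: \([0-9a-f-]*\).*/\1/p' | sort -u
}

# Event lines copied from the question; two different IDs come back.
distinct_boot_ids <<'EOF'
Warning Rebooted 12m (x814 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
Normal Starting 10m kubelet, ubuntu Starting kubelet.
Warning Rebooted 2m (x870 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 658b4a08-d724-425e-9246-2b41995ecc46
Warning Rebooted 15s (x60 over 10m) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
EOF
```

After the rename, one would expect a single boot ID per node as long as that machine has not actually been rebooted.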