How can we increase the size of ephemeral storage on a Kubernetes worker node?
We deployed a cluster with kubeadm (1 master, 4 worker nodes).
$ kubectl describe node worker1
Name: worker1
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=worker1
role=slave1
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 24 Sep 2019 14:15:42 +0330
Taints: node.kubernetes.io/disk-pressure:NoSchedule
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 24 Sep 2019 14:16:19 +0330 Tue, 24 Sep 2019 14:16:19 +0330 WeaveIsUp Weave pod has set this
OutOfDisk False Mon, 07 Oct 2019 15:35:53 +0330 Sun, 06 Oct 2019 02:21:55 +0330 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Mon, 07 Oct 2019 15:35:53 +0330 Sun, 06 Oct 2019 02:21:55 +0330 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure True Mon, 07 Oct 2019 15:35:53 +0330 Mon, 07 Oct 2019 13:58:23 +0330 KubeletHasDiskPressure kubelet has disk pressure
PIDPressure False Mon, 07 Oct 2019 15:35:53 +0330 Tue, 24 Sep 2019 14:15:42 +0330 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Mon, 07 Oct 2019 15:35:53 +0330 Sun, 06 Oct 2019 02:21:55 +0330 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 192.168.88.206
Hostname: worker1
Capacity:
attachable-volumes-azure-disk: 16
cpu: 4
ephemeral-storage: 19525500Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16432464Ki
pods: 110
Allocatable:
attachable-volumes-azure-disk: 16
cpu: 4
ephemeral-storage: 17994700771
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16330064Ki
pods: 110
System Info:
Machine ID: 2fc8f9eejgh5274kg1ab3f5b6570a8
System UUID: 52454D5843-391B-5454-BC35-E0EC5454D19A
Boot ID: 5454514e-4e5f-4e46-af9b-2809f394e06f
Kernel Version: 4.4.0-116-generic
OS Image: Ubuntu 16.04.4 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://17.3.2
Kubelet Version: v1.12.1
Kube-Proxy Version: v1.12.1
Non-terminated Pods: (0 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 0 (0%) 0 (0%)
memory 0 (0%) 0 (0%)
attachable-volumes-azure-disk 0 0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 45m kube-proxy, worker1 Starting kube-proxy.
Normal Starting 23m kube-proxy, worker1 Starting kube-proxy.
Warning EvictionThresholdMet 2m29s (x502 over 5d5h) kubelet, worker1 Attempting to reclaim ephemeral-storage
Normal Starting 75s kube-proxy, worker1 Starting kube-proxy.
As the description of worker1 shows, it is under disk pressure (ephemeral-storage: 19525500Ki). We mounted an additional hard disk at /dev/sdb1.
On worker1:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 7.9G 0 7.9G 0% /dev
tmpfs 1.6G 163M 1.5G 11% /run
/dev/sda1 19G 16G 2.4G 87% /
tmpfs 7.9G 5.1M 7.9G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 7.9G 0 7.9G 0% /sys/fs/cgroup
/dev/sdb1 99G 61M 94G 1% /data
tmpfs 1.6G 0 1.6G 0% /run/user/1003
But the problem persists. How can I tell the kubelet to add this mount point to worker1's ephemeral storage? More generally, how can we increase the ephemeral storage of a node in a Kubernetes cluster?
Unfortunately (AFAIK), the kubelet that runs on your node doesn't really have a SIGHUP mechanism to hang up and pick up a new configuration the way applications like Nginx do. The short answer is that you will have to restart the kubelet. Typically:
$ systemctl restart kubelet
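For the specific situation in the question (a new disk mounted at /data), one common workaround, sketched here with assumed paths, is to relocate the kubelet's data directory onto the larger disk before restarting. The kubelet reports ephemeral-storage capacity from the filesystem backing its root dir (/var/lib/kubelet by default), so bind-mounting that path onto the new disk makes the capacity reflect /dev/sdb1:

```shell
# Sketch only; assumes the kubelet's default root dir /var/lib/kubelet
# and the new disk mounted at /data (as in the df output above).
systemctl stop kubelet
mv /var/lib/kubelet /data/kubelet
mkdir /var/lib/kubelet
mount --bind /data/kubelet /var/lib/kubelet
# Persist the bind mount across reboots:
echo '/data/kubelet /var/lib/kubelet none bind 0 0' >> /etc/fstab
systemctl start kubelet
```

Note that the container runtime's writable layers also count toward ephemeral storage, so with Docker you may need to move /var/lib/docker (or change its data-root) onto the new disk as well.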
If you don't want your running applications to be affected, Kubernetes does have a mechanism for this: drain and cordon.
If you want to shut down the pods on the node yourself, and keep track of what shuts down and when, you can use cordon to prevent any workloads from being scheduled on that node:
$ kubectl cordon <nodename>
If you want Kubernetes to evict the pods on that node for you (while also marking it unschedulable, as cordon does):
$ kubectl drain <nodename>
A nice thing about drain is that it honors PodDisruptionBudget resources, allowing you to safely drain pods without affecting uptime (assuming you have defined your pod disruption budgets appropriately).
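As a hypothetical illustration (the name myapp-pdb and the app=myapp label are made up for this example), a PodDisruptionBudget that keeps at least 2 replicas available during a drain could look like this; policy/v1beta1 matches the v1.12 cluster shown above, while policy/v1 replaces it on 1.21+:

```shell
kubectl apply -f - <<'EOF'
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb        # hypothetical name
spec:
  minAvailable: 2        # drain will not evict below 2 ready replicas
  selector:
    matchLabels:
      app: myapp         # assumed label on the workload's pods
EOF
```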
After a lot of searching, I decided to enlarge /dev/sda1. Doing so was not pleasant, but it was the only way I could find. Now the worker's ephemeral storage has increased.
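For reference, a rough sketch of how a root partition like /dev/sda1 can be grown online, assuming the underlying disk was first enlarged at the VM/hypervisor level and the root filesystem is ext4 (both assumptions, not stated above):

```shell
apt-get install -y cloud-guest-utils   # provides growpart on Ubuntu
growpart /dev/sda 1                    # grow partition 1 to fill the disk
resize2fs /dev/sda1                    # grow the ext4 filesystem online
systemctl restart kubelet              # let the kubelet re-detect capacity
```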
Filesystem Size Used Avail Use% Mounted on
udev 7.9G 0 7.9G 0% /dev
tmpfs 1.6G 151M 1.5G 10% /run
/dev/sda1 118G 24G 89G 22% /
$ kubectl describe node worker1
attachable-volumes-azure-disk: 16
cpu: 4
ephemeral-storage: 123729380Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16432464Ki
pods: 110
Allocatable:
attachable-volumes-azure-disk: 16
cpu: 4
ephemeral-storage: 114028996420
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16330064Ki
pods: 110