How can we increase the size of ephemeral storage in a Kubernetes worker node?

We deployed a cluster with kubeadm (1 master, 4 worker nodes).

$ kubectl describe node worker1

Name:               worker1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=worker1
                    role=slave1
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 24 Sep 2019 14:15:42 +0330
Taints:             node.kubernetes.io/disk-pressure:NoSchedule
Unschedulable:      false
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Tue, 24 Sep 2019 14:16:19 +0330   Tue, 24 Sep 2019 14:16:19 +0330   WeaveIsUp                    Weave pod has set this
  OutOfDisk            False   Mon, 07 Oct 2019 15:35:53 +0330   Sun, 06 Oct 2019 02:21:55 +0330   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure       False   Mon, 07 Oct 2019 15:35:53 +0330   Sun, 06 Oct 2019 02:21:55 +0330   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         True    Mon, 07 Oct 2019 15:35:53 +0330   Mon, 07 Oct 2019 13:58:23 +0330   KubeletHasDiskPressure       kubelet has disk pressure
  PIDPressure          False   Mon, 07 Oct 2019 15:35:53 +0330   Tue, 24 Sep 2019 14:15:42 +0330   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Mon, 07 Oct 2019 15:35:53 +0330   Sun, 06 Oct 2019 02:21:55 +0330   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  192.168.88.206
  Hostname:    worker1
Capacity:
 attachable-volumes-azure-disk:  16
 cpu:                            4
 ephemeral-storage:              19525500Ki
 hugepages-1Gi:                  0
 hugepages-2Mi:                  0
 memory:                         16432464Ki
 pods:                           110
Allocatable:
 attachable-volumes-azure-disk:  16
 cpu:                            4
 ephemeral-storage:              17994700771
 hugepages-1Gi:                  0
 hugepages-2Mi:                  0
 memory:                         16330064Ki
 pods:                           110
System Info:
 Machine ID:                 2fc8f9eejgh5274kg1ab3f5b6570a8
 System UUID:                52454D5843-391B-5454-BC35-E0EC5454D19A
 Boot ID:                    5454514e-4e5f-4e46-af9b-2809f394e06f
 Kernel Version:             4.4.0-116-generic
 OS Image:                   Ubuntu 16.04.4 LTS
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.3.2
 Kubelet Version:            v1.12.1
 Kube-Proxy Version:         v1.12.1
Non-terminated Pods:         (0 in total)
  Namespace                  Name    CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------                  ----    ------------  ----------  ---------------  -------------
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                       Requests  Limits
  --------                       --------  ------
  cpu                            0 (0%)    0 (0%)
  memory                         0 (0%)    0 (0%)
  attachable-volumes-azure-disk  0         0
Events:
  Type     Reason                Age                     From                 Message
  ----     ------                ----                    ----                 -------
  Normal   Starting              45m                     kube-proxy, worker1  Starting kube-proxy.
  Normal   Starting              23m                     kube-proxy, worker1  Starting kube-proxy.
  Warning  EvictionThresholdMet  2m29s (x502 over 5d5h)  kubelet, worker1     Attempting to reclaim ephemeral-storage
  Normal   Starting              75s                     kube-proxy, worker1  Starting kube-proxy.

As the describe output for worker1 shows, there is disk pressure (ephemeral-storage: 19525500Ki). We attached a hard disk as /dev/sdb1.

On worker1:

$ df -h

Filesystem      Size  Used Avail Use% Mounted on
udev            7.9G     0  7.9G   0% /dev
tmpfs           1.6G  163M  1.5G  11% /run
/dev/sda1        19G   16G  2.4G  87% /
tmpfs           7.9G  5.1M  7.9G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/sdb1        99G   61M   94G   1% /data
tmpfs           1.6G     0  1.6G   0% /run/user/1003
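For context, the kubelet's default hard-eviction thresholds are nodefs.available&lt;10% and imagefs.available&lt;15%; with /dev/sda1 at 87% used (roughly 13% available), the root filesystem that backs both /var/lib/kubelet and /var/lib/docker is already past the imagefs line, which is consistent with the DiskPressure condition above. A minimal sketch (assuming GNU coreutils `df`) to check root-filesystem usage against the stricter 15% default:

```shell
# Compare root filesystem usage against the kubelet's default
# imagefs.available<15% hard-eviction threshold (i.e. usage > 85%).
THRESHOLD=85
USAGE=$(df --output=pcent / | tail -1 | tr -d ' %')
if [ "$USAGE" -gt "$THRESHOLD" ]; then
  echo "disk pressure likely: ${USAGE}% used"
else
  echo "below threshold: ${USAGE}% used"
fi
```

The exact thresholds are configurable via the kubelet's `--eviction-hard` flag or its config file, so check your node's settings before relying on the defaults.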

But the problem still exists. How can I tell the kubelet to add this mount point to worker1's ephemeral storage? More generally, how do we increase the ephemeral storage of a node in a Kubernetes cluster?

Unfortunately, as far as I know, the kubelet running on your node doesn't have a SIGHUP-style mechanism to pick up new configuration on the fly the way applications like Nginx do. The short answer is that you will have to restart the kubelet. Typically:

$ systemctl restart kubelet
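Note that restarting the kubelet by itself won't make the new /data mount count toward ephemeral storage: the kubelet measures the filesystem backing its root directory and the container runtime's image store, both of which still live on /dev/sda1 here. One option, not covered in the answer above, is to move Docker's data directory onto the new disk via `data-root` in /etc/docker/daemon.json. This is a sketch under the assumption that Docker is the runtime and that /data is the new mount; adapt the paths to your layout:

```shell
# Safely move Docker's image/container store onto the new disk.
kubectl drain worker1 --ignore-daemonsets
sudo systemctl stop kubelet docker

# Point Docker at the new location and move the existing data across.
echo '{ "data-root": "/data/docker" }' | sudo tee /etc/docker/daemon.json
sudo mv /var/lib/docker /data/docker

sudo systemctl start docker kubelet
kubectl uncordon worker1
```

This requires a maintenance window on the node, and whether the kubelet's reported ephemeral-storage capacity follows the image store depends on your version's imagefs handling, so verify with `kubectl describe node` afterwards.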

If you don't want your running applications to be affected, Kubernetes does have mechanisms for this: drain and cordon.

If you want to shut down the pods on the node yourself, and control what gets shut down and when, you can use cordon to prevent any new workloads from being scheduled on that node:

$ kubectl cordon <nodename>

If you want Kubernetes to evict the pods on that node for you (this also cordons the node, marking it unschedulable):

$ kubectl drain <nodename>

A nice side effect of drain is that it honors PodDisruptionBudget resources, which lets you drain pods safely without affecting uptime (assuming you have defined appropriate pod disruption budgets).
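For example, a minimal PodDisruptionBudget that keeps at least two replicas of an app running while its node is drained. The `myapp` label and the replica count here are hypothetical; `policy/v1beta1` is the API group for the v1.12 cluster shown above:

```shell
# Create a PDB so `kubectl drain` never evicts below 2 running pods.
kubectl apply -f - <<'EOF'
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
EOF
```

With this in place, drain blocks (and retries) rather than evicting a pod whose removal would drop `myapp` below two available replicas.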

After a lot of searching, I decided to expand /dev/sda1. It wasn't pleasant to do, but it was the only way I could find. Now the worker's ephemeral storage has increased:

Filesystem      Size  Used Avail Use% Mounted on
udev            7.9G     0  7.9G   0% /dev
tmpfs           1.6G  151M  1.5G  10% /run
/dev/sda1       118G   24G   89G  22% /

$ kubectl describe node worker1

Capacity:
 attachable-volumes-azure-disk:  16
 cpu:                            4
 ephemeral-storage:              123729380Ki
 hugepages-1Gi:                  0
 hugepages-2Mi:                  0
 memory:                         16432464Ki
 pods:                           110
Allocatable:
 attachable-volumes-azure-disk:  16
 cpu:                            4
 ephemeral-storage:              114028996420
 hugepages-1Gi:                  0
 hugepages-2Mi:                  0
 memory:                         16330064Ki
 pods:                           110