Cannot create a deployment that requests more than 2Gi memory

My deployment's pods are being evicted because of memory consumption:

  Type     Reason   Age   From                                             Message
  ----     ------   ----  ----                                             -------
  Warning  Evicted  1h    kubelet, gke-XXX-default-pool-XXX  The node was low on resource: memory. Container my-container was using 1700040Ki, which exceeds its request of 0.
  Normal   Killing  1h    kubelet, gke-XXX-default-pool-XXX  Killing container with id docker://my-container:Need to kill Pod

I tried to give it more memory by adding the following to my deployment YAML:

apiVersion: apps/v1
kind: Deployment
...
spec:
  ...
  template:
    ...
    spec:
      ...
      containers:
      - name: my-container
        image: my-container:latest
        ...
        resources:
          requests:
            memory: "3Gi"

But the deployment fails:

  Type     Reason             Age               From                Message
  ----     ------             ----              ----                -------
  Warning  FailedScheduling   4s (x5 over 13s)  default-scheduler   0/3 nodes are available: 3 Insufficient memory.
  Normal   NotTriggerScaleUp  0s                cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added)

The deployment requests only a single container.

I am using GKE with autoscaling; the nodes in the default (and only) pool have 3.75 GB of memory.

Through trial and error, I found that the maximum memory I can request is "2Gi". Why can't a single pod use a node's full 3.75 GB? Do I need nodes with a larger memory capacity?

Each node reserves some memory for Kubernetes system workloads (such as kube-dns, and any add-ons you select). That means you cannot get access to all of the node's 3.75 GiB of memory.

So to have a pod request 3Gi of memory, you will indeed need nodes with a larger memory capacity.

Even though the node has 3.75 GB of total memory, it is very likely that not all of that capacity is allocatable.

Kubernetes reserves part of a node's capacity for system services, so that containers cannot consume so many resources on the node that those system services stop working.

From the docs:

Kubernetes nodes can be scheduled to Capacity. Pods can consume all the available capacity on a node by default. This is an issue because nodes typically run quite a few system daemons that power the OS and Kubernetes itself. Unless resources are set aside for these system daemons, pods and system daemons compete for resources and lead to resource starvation issues on the node.
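Outside of GKE, this reservation is what the kubelet's kube-reserved, system-reserved, and eviction-hard settings control. A minimal KubeletConfiguration sketch for a self-managed node (the values here are placeholders, not what GKE actually uses):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# memory set aside for Kubernetes system daemons (kubelet, container runtime, ...)
kubeReserved:
  memory: "500Mi"
# memory set aside for OS system daemons (systemd, sshd, ...)
systemReserved:
  memory: "200Mi"
# node-level eviction threshold; pods are evicted before free memory drops below this
evictionHard:
  memory.available: "100Mi"

Allocatable memory is roughly the node's capacity minus these reservations, and it is the allocatable figure that the scheduler compares your request against.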

Since you are using GKE, the defaults are not used; running the following command will show how much allocatable resource you have on each node:

kubectl describe node [NODE_NAME] | grep Allocatable -B 4 -A 3
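If you only want the allocatable memory figure for every node at once, a jsonpath query along these lines should also work (a sketch; it reads the standard Node status fields):

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.memory}{"\n"}{end}'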

From the GKE docs:

Allocatable resources are calculated in the following way:

Allocatable = Capacity - Reserved - Eviction Threshold

For memory resources, GKE reserves the following:

  • 25% of the first 4GB of memory
  • 20% of the next 4GB of memory (up to 8GB)
  • 10% of the next 8GB of memory (up to 16GB)
  • 6% of the next 112GB of memory (up to 128GB)
  • 2% of any memory above 128GB

GKE reserves an additional 100 MiB memory on each node for kubelet eviction.
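Applying that formula to a 3.75 GB node gives roughly the following (a back-of-the-envelope calculation; the exact number reported by kubectl will differ slightly):

  Reserved            = 25% of 3.75 GB          ~ 0.94 GB
  Eviction threshold  = 100 MiB                 ~ 0.10 GB
  Allocatable         = 3.75 - 0.94 - 0.10      ~ 2.71 GB

System pods such as kube-dns also consume part of that allocatable pool, which is why a request of 3Gi cannot be scheduled while roughly 2Gi can.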

As the error message indicates, scaling the cluster out will not solve the problem, because each node's capacity is limited to X amount of memory while your pod needs more than that.
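If you really need a single pod with a 3Gi request, the usual fix on GKE is to add a node pool with a larger machine type and let the pod schedule there. A sketch (the pool and cluster names are placeholders; n1-standard-2 nodes have 7.5 GB of memory):

gcloud container node-pools create high-mem-pool \
  --cluster=my-cluster \
  --machine-type=n1-standard-2 \
  --num-nodes=1 \
  --enable-autoscaling --min-nodes=1 --max-nodes=3

The 3Gi request will then fit within the allocatable memory of the new, larger nodes.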