Why do AKS nodes show less allocatable memory when the actual memory is still available?
I'd like to know what factors AKS considers when reserving memory on a node and how it calculates allocatable memory.
In my cluster we have multiple nodes (2 CPUs, 7 GB RAM each).
What I observed is that all nodes (18+) show only about 4 GB of the 7 GB as allocatable. As a result, our cluster runs into a resource crunch for new deployments, and we have to increase the node count accordingly to meet resource demand.
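For reference, comparing each node's total capacity with its allocatable memory is a one-liner (a minimal sketch; node names and values will differ per cluster):

kubectl get nodes -o custom-columns=NAME:.metadata.name,MEM_CAPACITY:.status.capacity.memory,MEM_ALLOCATABLE:.status.allocatable.memory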
Updated
As mentioned in my comment below, I've added the kubectl top node output below. The strange part is how a node's memory consumption percentage can exceed 100%.
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
aks-nodepool1-xxxxxxxxx-vmssxxxx00 265m 13% 2429Mi 53%
aks-nodepool1-xxxxxxxxx-vmssxxxx01 239m 12% 3283Mi 71%
aks-nodepool1-xxxxxxxxx-vmssxxxx0g 465m 24% 4987Mi 109%
aks-nodepool2-xxxxxxxxx-vmssxxxx8i 64m 3% 3085Mi 67%
aks-nodepool2-xxxxxxxxx-vmssxxxx8p 114m 6% 5320Mi 116%
aks-nodepool2-xxxxxxxxx-vmssxxxx9n 105m 5% 2715Mi 59%
aks-nodepool2-xxxxxxxxx-vmssxxxxaa 134m 7% 5216Mi 114%
aks-nodepool2-xxxxxxxxx-vmssxxxxat 179m 9% 5498Mi 120%
aks-nodepool2-xxxxxxxxx-vmssxxxxaz 141m 7% 4769Mi 104%
aks-nodepool2-xxxxxxxxx-vmssxxxxb0 72m 3% 1972Mi 43%
aks-nodepool2-xxxxxxxxx-vmssxxxxb1 133m 7% 3684Mi 80%
aks-nodepool2-xxxxxxxxx-vmssxxxxb3 182m 9% 5294Mi 115%
aks-nodepool2-xxxxxxxxx-vmssxxxxb4 133m 7% 5009Mi 109%
aks-nodepool2-xxxxxxxxx-vmssxxxxbj 68m 3% 1783Mi 39%
So here I'll take the node aks-nodepool2-xxxxxxxxx-vmssxxxx8p (114m 6% 5320Mi 116%) as an example.
I added up the memory usage of every pod on that node; the total is about 4.1 GB, while the node's allocatable memory is 4.6 GB out of the actual 7 GB.
Why does the "top node" output here differ from the sum of the "top pods" output for the pods on that node?
Expected percentage == 4.1GB / 4.6GB == ~89%
But the top node command reports 116%.
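For reference, the per-node pod total can be scripted from the CLI (a rough sketch: it assumes metrics-server is running, reuses the placeholder node name from the table above, and matches kubectl top pods rows against the pods scheduled on that node):

NODE=aks-nodepool2-xxxxxxxxx-vmssxxxx8p
# List namespace/name of every pod scheduled on the node.
kubectl get pods --all-namespaces --field-selector spec.nodeName="$NODE" \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\n"}{end}' > /tmp/pods-on-node
# Sum the Mi column that kubectl top pods reports for just those pods.
kubectl top pods --all-namespaces --no-headers \
  | awk 'NR==FNR {want[$0]=1; next} (($1 "/" $2) in want) {sum += $4+0} END {print sum " Mi total"}' /tmp/pods-on-node -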
This is expected behavior, and it's how AKS keeps the cluster safe and functional.
When you create a k8s cluster in AKS, it doesn't mean you get all the Memory/CPU your VMs have. Depending on the cluster configuration, it may consume even more resources than in your case; for example, if you enable the OMS agent to get insights into AKS, it reserves some capacity too.
From the official documentation, Kubernetes core concepts for Azure Kubernetes Service (AKS) --> Resource reservations. For associated best practices, see Best practices for basic scheduler features in AKS.
AKS uses node resources to help the node function as part of your cluster. This usage can create a discrepancy between your node's total resources and the allocatable resources in AKS. Remember this information when setting requests and limits for user deployed pods.
To find a node's allocatable resources, run:
kubectl describe node [NODE_NAME]
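If you only need the Allocatable block rather than the full describe output, a jsonpath query works as well (a convenience variant, not part of the quoted docs):

kubectl get node [NODE_NAME] -o jsonpath='{.status.allocatable}'
kubectl get node [NODE_NAME] -o jsonpath='{.status.allocatable.memory}{"\n"}'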
To maintain node performance and functionality, AKS reserves resources on each node. As a node grows larger in resources, the resource reservation grows due to a higher need for management of user-deployed pods.
Two types of resources are reserved:
- CPU
Reserved CPU is dependent on node type and cluster configuration, which may cause less allocatable CPU due to running additional features.
- Memory
Memory utilized by AKS includes the sum of two values.
- kubelet daemon
The kubelet daemon is installed on all Kubernetes agent nodes to manage container creation and termination.
By default on AKS, kubelet daemon has the memory.available<750Mi eviction rule, ensuring a node must always have at least 750 Mi allocatable at all times. When a host is below that available memory threshold, the kubelet will trigger to terminate one of the running pods and free up memory on the host machine.
- A regressive rate of memory reservations for the kubelet daemon to properly function (kube-reserved).
25% of the first 4 GB of memory
20% of the next 4 GB of memory (up to 8 GB)
10% of the next 8 GB of memory (up to 16 GB)
6% of the next 112 GB of memory (up to 128 GB)
2% of any memory above 128 GB
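The regressive schedule above is straightforward to encode. Below is a minimal sketch (a hypothetical helper, not an official tool) that computes the kube-reserved memory in GB for a given node size; the 750Mi eviction threshold is added separately, as the 7 GB example further down shows:

kube_reserved_gb() {
  awk -v m="$1" 'BEGIN {
    r = 0
    t = (m < 4 ? m : 4);     r += 0.25 * t; m -= t    # 25% of first 4 GB
    t = (m < 4 ? m : 4);     r += 0.20 * t; m -= t    # 20% of next 4 GB (up to 8 GB)
    t = (m < 8 ? m : 8);     r += 0.10 * t; m -= t    # 10% of next 8 GB (up to 16 GB)
    t = (m < 112 ? m : 112); r += 0.06 * t; m -= t    # 6% of next 112 GB (up to 128 GB)
    if (m > 0)               r += 0.02 * m            # 2% of anything above 128 GB
    print r
  }'
}
kube_reserved_gb 7    # prints 1.6 (= 0.25*4 + 0.20*3)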
Memory and CPU allocation rules:
- Keep agent nodes healthy, including some hosting system pods critical to cluster health.
- Cause the node to report less allocatable memory and CPU than it would if it were not part of a Kubernetes cluster.
The above resource reservations can't be changed.
For example, if a node offers 7 GB, it will report 34% of memory not allocatable including the 750Mi hard eviction threshold.
0.75 + (0.25*4) + (0.20*3) = 0.75GB + 1GB + 0.6GB = 2.35GB / 7GB = 33.57% reserved
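Plugging the 7 GB node into the helper sketched above and adding the 750Mi (0.75 GB) eviction threshold reproduces this figure:

echo "$(kube_reserved_gb 7) + 0.75" | bc    # 2.35 (GB reserved out of 7)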
In addition to reservations for Kubernetes itself, the underlying node OS also reserves an amount of CPU and memory resources to maintain OS functions.
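One last note on the readings above 100%: kubectl top node reports the node's total memory working set, which includes the OS, the kubelet, and the container runtime rather than just pods, as a percentage of the node's allocatable memory, not its capacity. Because the reservations above shrink the denominator while system processes still consume real memory, MEMORY% can legitimately exceed 100%, which is what the 116% in your table shows. A quick sanity check (again using the placeholder node name from the question):

NODE=aks-nodepool2-xxxxxxxxx-vmssxxxx8p
kubectl get node "$NODE" -o jsonpath='{.status.allocatable.memory}{"\n"}'   # the denominator kubectl top uses
kubectl top node "$NODE" --no-headers                                       # working set and MEMORY%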