What is the use of Vertical Pod Autoscaler "Auto" mode?

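As I understand from the VPA documentation, the Vertical Pod Autoscaler stops/restarts a pod based on the predicted lower/upper bounds and target for its requests/limits. In "Auto" mode, it says the pod will be stopped and restarted. However, I don't see the point of making a prediction and restarting the pod while it is still working: even though we know it may eventually run out of resources, it is still working, and we could wait until it actually runs out of memory/CPU before rescheduling it. Isn't it more efficient to just wait for the pod to go out of memory/cpu and then restart it with the new predicted request?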

Is recovering from a dead container more costly than stopping and restarting the pod ourselves? If yes, in what ways?

Isn't it more efficient to just wait for the pod to go out of memory/cpu and then restart it with the new predicted request?

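In my opinion, this is not the best solution. If a container tries to use more CPU than its limit, its CPU usage is throttled; if it tries to use more memory than its limit, Kubernetes OOM-kills the container. Limits can be overcommitted, though: the sum of the limits of the pods on a node can be higher than the node's capacity, so waiting for pods to hit their limits can exhaust the node's memory and cause the death of other workloads/pods.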
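To make the overcommit point concrete, here is a minimal arithmetic sketch. The node size, pod names, and request/limit values are hypothetical numbers chosen for illustration, not figures from any real cluster.

```python
# Hypothetical numbers: one node with 8 GiB of allocatable memory.
NODE_ALLOCATABLE_MIB = 8192

# (request, limit) in MiB for each pod scheduled on the node (made-up values).
pods = {
    "api": (2048, 4096),
    "worker": (2048, 4096),
    "cache": (2048, 4096),
}

total_requests = sum(request for request, _ in pods.values())
total_limits = sum(limit for _, limit in pods.values())

# Scheduling is based on requests, so all three pods fit on the node.
fits_on_node = total_requests <= NODE_ALLOCATABLE_MIB
# The limits add up to more than the node has: the node is overcommitted.
overcommitted = total_limits > NODE_ALLOCATABLE_MIB

print(f"requests={total_requests} MiB, limits={total_limits} MiB")
print(f"fits_on_node={fits_on_node}, overcommitted={overcommitted}")
```

If all three pods grow toward their limits at the same time, the node runs out of memory before any single pod has exceeded its own limit, which is why "wait until it fails" can hurt neighbouring pods.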

To answer your question - VPA was designed to simplify exactly these scenarios:

Vertical Pod Autoscaler (VPA) frees the users from necessity of setting up-to-date resource limits and requests for the containers in their pods. When configured, it will set the requests automatically based on usage and thus allow proper scheduling onto nodes so that appropriate resource amount is available for each pod. It will also maintain ratios between limits and requests that were specified in initial containers configuration.
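The "maintain ratios between limits and requests" part of the quote can be illustrated with a small calculation. This is only a sketch of proportional scaling; `scale_limit` is a hypothetical helper, not a function from the VPA code base.

```python
def scale_limit(original_request: float, original_limit: float,
                recommended_request: float) -> float:
    """Return a limit that keeps the original limit/request ratio."""
    if original_request <= 0:
        raise ValueError("original_request must be positive")
    ratio = original_limit / original_request
    return recommended_request * ratio


# CPU in millicores: the container was created with request=100m, limit=200m,
# so the ratio is 2.0. A new recommended request of 250m keeps that ratio.
new_limit = scale_limit(100, 200, 250)
print(new_limit)  # 500.0
```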

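Moreover, VPA is responsible not only for scaling up but also for scaling down: it can down-scale pods that are over-requesting resources as well as up-scale pods that are under-requesting resources, based on their usage over time.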

Is recovering from a dead container more costly than stopping and restarting the pod ourselves? If yes, in what ways?

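Regarding the cost of recovering from a dead container - according to the official doc, the main possible cost is the requests that may end up being lost during the OOM kill.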

According to the official documentation, VPA operates in these modes:

"Auto": VPA assigns resource requests on pod creation as well as updates them on existing pods using the preferred update mechanism. Currently this is equivalent to "Recreate".

"Recreate": VPA assigns resource requests on pod creation as well as updates them on existing pods by evicting them when the requested resources differ significantly from the new recommendation (respecting the Pod Disruption Budget, if defined).

"Initial": VPA only assigns resource requests on pod creation and never changes them later.

"Off": VPA does not automatically change resource requirements of the pods.
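As a rough illustration of where the mode is set, the sketch below builds a VerticalPodAutoscaler object as a Python dictionary and prints it as JSON. The helper `vpa_manifest` and the names `my-app-vpa` and `my-app` are hypothetical; the field layout (`targetRef`, `updatePolicy.updateMode`) assumes the `autoscaling.k8s.io/v1` API.

```python
import json

# The four update modes listed above.
ALLOWED_MODES = {"Auto", "Recreate", "Initial", "Off"}


def vpa_manifest(name: str, target_deployment: str, update_mode: str = "Auto") -> dict:
    """Build a minimal VPA object targeting a Deployment (illustrative helper)."""
    if update_mode not in ALLOWED_MODES:
        raise ValueError(f"unknown update mode: {update_mode!r}")
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": target_deployment,
            },
            "updatePolicy": {"updateMode": update_mode},
        },
    }


# "Off" only produces recommendations, which is a cautious way to start.
print(json.dumps(vpa_manifest("my-app-vpa", "my-app", "Off"), indent=2))
```

Starting in "Off" mode lets you inspect the recommendations first and switch to "Auto" later, once you are comfortable with pods being evicted.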

Note: VPA limitations

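  • VPA recommendations might exceed the available resources, such as your cluster capacity or your team's quota. Not enough available resources may cause pods to go pending.
  • VPA in Auto or Recreate mode won't evict pods with one replica, as this would cause disruption.
  • Quick memory growth might cause the container to be out-of-memory killed. As out-of-memory killed pods aren't rescheduled, VPA won't apply new resources.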

Please also take a look at the VPA Known limitations:

  • Updating running pods is an experimental feature of VPA. Whenever VPA updates the pod resources the pod is recreated, which causes all running containers to be restarted. The pod may be recreated on a different node.
  • VPA does not evict pods which are not run under a controller. For such pods Auto mode is currently equivalent to Initial.
  • VPA reacts to most out-of-memory events, but not in all situations.
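Putting the modes and the limitations together, the eviction decision can be sketched roughly as follows. This is a simplified model, not the actual VPA updater logic: the `Pod` and `Recommendation` types and `should_evict` are hypothetical, and treating "request outside the lower/upper bound" as the trigger for "differs significantly" is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    request: float        # current request (e.g. memory in MiB)
    has_controller: bool  # managed by a Deployment, StatefulSet, etc.
    replicas: int         # replicas of the owning workload


@dataclass
class Recommendation:
    lower_bound: float
    target: float
    upper_bound: float


def should_evict(mode: str, pod: Pod, rec: Recommendation) -> bool:
    """Simplified model of when a running pod would be evicted and recreated."""
    # "Initial" and "Off" never touch running pods.
    if mode not in ("Auto", "Recreate"):
        return False
    # Pods without a controller are not evicted (Auto behaves like Initial).
    if not pod.has_controller:
        return False
    # Single-replica workloads are not evicted, to avoid disruption.
    if pod.replicas <= 1:
        return False
    # Assumption: evict only when the request falls outside the recommended range.
    return not (rec.lower_bound <= pod.request <= rec.upper_bound)


rec = Recommendation(lower_bound=200, target=300, upper_bound=500)
print(should_evict("Auto", Pod(request=100, has_controller=True, replicas=3), rec))
print(should_evict("Initial", Pod(request=100, has_controller=True, replicas=3), rec))
```

In this model, a pod whose request is already within the recommended range is left alone, which is why a well-sized pod is not restarted just because VPA is running in "Auto" mode.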

Additional resources: VERTICAL POD AUTOSCALING: THE DEFINITIVE GUIDE