How to implement Kubernetes horizontal pod autoscaling with scale up/down policies?
Kubernetes v1.19 on AWS EKS

I'm trying to implement horizontal pod autoscaling in my EKS cluster, and I'm trying to mimic what we currently do with ECS. With ECS we do something like the following:
- scale up when CPU >= 90% after 3 consecutive 1-minute sampling periods
- scale down when CPU <= 60% after 5 consecutive 1-minute sampling periods
- scale up when memory >= 85% after 3 consecutive 1-minute sampling periods
- scale down when memory <= 70% after 5 consecutive 1-minute sampling periods
I'm trying to use the HorizontalPodAutoscaler kind, and helm create gave me this template. (Note that I modified it to fit my needs, but the metrics stanza is still there.)
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "microserviceChart.Name" . }}
  labels:
    {{- include "microserviceChart.Name" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "microserviceChart.Name" . }}
  minReplicas: {{ include "microserviceChart.minReplicas" . }}
  maxReplicas: {{ include "microserviceChart.maxReplicas" . }}
  metrics:
    {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
    {{- end }}
    {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
    - type: Resource
      resource:
        name: memory
        targetAverageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
    {{- end }}
{{- end }}
However, how do I fit the scale up/down policy information shown in Horizontal Pod Autoscaling into the template above, to match the behavior I want?
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed metrics (like CPU or Memory).

The official walkthrough focuses on the HPA and its scaling:

The algorithm that scales the amount of replicas is the following:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
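For instance, a quick worked example of this formula (the numbers are purely illustrative, not from the question): with 3 current replicas, a desired utilization of 60% and an observed utilization of 90%, it gives:

desiredReplicas = ceil[3 * (90/60)] = ceil[4.5] = 5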
The (already rendered) autoscaling example can be implemented with a YAML manifest like the one below:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: HPA-NAME
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: DEPLOYMENT-NAME
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
A side note! HPA will calculate both metrics and choose the one that produces the bigger desiredReplicas!
Addressing the comment I wrote under the question:
I think we misunderstood each other. It's perfectly okay to "scale up when CPU >= 90" but due to logic behind the formula I don't think it will be possible to say "scale down when CPU <=70". According to the formula it would be something in the midst of: scale up when CPU >= 90 and scale down when CPU =< 45.
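(One way to see where a number like 45 can come from, using the formula above: with a 90% target and 2 replicas, the HPA only drops to 1 replica when ceil[2 * (v/90)] = 1, which holds only for v <= 45.)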
This example could be misleading and it's not 100% true in all scenarios. Take a look at the example below: an HPA set to an averageUtilization of 75%.
Quick calculations, with some degree of approximation (the default tolerance for HPA is 0.1, meaning scaling is skipped while the ratio currentMetricValue/desiredMetricValue stays within 1.0 ± 0.1):
2 replicas:
scale-up (by 1) should happen when currentMetricValue is >=80%:
x = ceil[2 * (80/75)], x = ceil[2.1(3)], x = 3
scale-down (by 1) should happen when currentMetricValue is <=33%:
x = ceil[2 * (33/75)], x = ceil[0.88], x = 1
8 replicas:
scale-up (by 1) should happen when currentMetricValue is >=76%:
x = ceil[8 * (76/75)], x = ceil[8.10(6)], x = 9
scale-down (by 1) should happen when currentMetricValue is <=64%:
x = ceil[8 * (64/75)], x = ceil[6.82(6)], x = 7
Following this example, 8 replicas with a currentMetricValue of 55 (and a desiredMetricValue set to 75) should scale-down to 6 replicas.
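Spelled out with the same formula: x = ceil[8 * (55/75)], x = ceil[5.86(6)], x = 6.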
More information on HPA decision-making (for example why it didn't scale) can be found by running:
$ kubectl describe hpa HPA-NAME
Name:                                                     nginx-scaler
Namespace:                                                default
Labels:                                                   <none>
Annotations:                                              <none>
CreationTimestamp:                                        Sun, 07 Mar 2021 22:48:58 +0100
Reference:                                                Deployment/nginx-scaling
Metrics:                                                  ( current / target )
  resource memory on pods (as a percentage of request):  5% (61903667200m) / 75%
  resource cpu on pods (as a percentage of request):     79% (199m) / 75%
Min replicas:                                             1
Max replicas:                                             10
Deployment pods:                                          5 current / 5 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:
  Type     Reason                   Age                   From                       Message
  ----     ------                   ----                  ----                       -------
  Warning  FailedGetResourceMetric  4m48s (x4 over 5m3s)  horizontal-pod-autoscaler  did not receive metrics for any ready pods
  Normal   SuccessfulRescale        103s                  horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
  Normal   SuccessfulRescale        71s                   horizontal-pod-autoscaler  New size: 5; reason: cpu resource utilization (percentage of request) above target
The HPA scaling procedure can be modified by the changes introduced in Kubernetes version 1.18 and newer, which added:
Support for configurable scaling behavior
Starting from v1.18 the v2beta2 API allows scaling behavior to be configured through the HPA behavior field. Behaviors are specified separately for scaling up and down in scaleUp or scaleDown section under the behavior field. A stabilization window can be specified for both directions which prevents the flapping of the number of the replicas in the scaling target. Similarly specifying scaling policies controls the rate of change of replicas while scaling.
I think you could use the newly introduced fields like behavior and stabilizationWindowSeconds to tune your workload to your specific needs.
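For example, a minimal sketch of such a behavior section (the window and policy values below are illustrative guesses at the ECS-style rules from the question, not an exact mapping; the HPA samples metrics continuously rather than in discrete 1-minute periods, and the HPA-NAME/DEPLOYMENT-NAME placeholders are the same as above):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: HPA-NAME
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: DEPLOYMENT-NAME
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 90
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 85
  behavior:
    # Illustrative: only scale up once the recommendation has held
    # for ~3 minutes, adding at most 2 pods per minute.
    scaleUp:
      stabilizationWindowSeconds: 180
      policies:
      - type: Pods
        value: 2
        periodSeconds: 60
    # Illustrative: only scale down once the recommendation has held
    # for ~5 minutes, removing at most 1 pod per minute.
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60

Roughly speaking, the scale-up window makes the HPA act on the smallest recommendation seen during the last 3 minutes, and the scale-down window on the largest recommendation over the last 5 minutes, which loosely approximates "N consecutive 1-minute sampling periods". Also note the targets here reuse the question's scale-up thresholds; as discussed above, the formula then implies scale-down at correspondingly lower utilization values, not at the exact ECS thresholds.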
I'd also recommend consulting the EKS documentation for more references, supported metrics and examples.