Why is Kubernetes HPA not scaling down (Memory)?
Summary
In our Kubernetes cluster we introduced HPAs with memory and CPU targets. Now we don't understand why one of our services is running with 2 replicas.
The service in question uses 57% / 85% of its memory and has 2 replicas instead of one. We suspect this is because when you add up the memory of both pods it exceeds 85%, while a single pod would not. So is this preventing it from scaling down? What can we do here?
We also observe a memory spike when the service is deployed. We run Spring Boot services on AKS (Azure) and suspect the HPA scales up during that spike and then never scales back down. Are we missing something, or does anyone have a suggestion?
Helm
HPA:
{{- $fullName := include "app.fullname" . -}}
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: {{ $fullName }}-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "app.name" . }}
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 85
And in the deployment:
# Horizontal-Pod-Auto-Scaler
resources:
  requests:
    memory: {{ $requestedMemory }}
    cpu: {{ $requesteCpu }}
  limits:
    memory: {{ $limitMemory }}
    cpu: {{ $limitCpu }}
Service defaults:
hpa:
  resources:
    request:
      memory: 500Mi
      cpu: 300m
    limits:
      memory: 1000Mi
      cpu: 999m
kubectl get hpa -n dev
NAME                            REFERENCE                              TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
xxxxxxxx-load-for-cluster-hpa   Deployment/xxxxxxxx-load-for-cluster   34%/85%, 0%/50%   1         10        1          4d7h
xxx5-ccg-hpa                    Deployment/xxx5-ccg                    58%/85%, 0%/50%   1         10        1          4d12h
iotbootstrapping-service-hpa    Deployment/iotbootstrapping-service    54%/85%, 0%/50%   1         10        1          4d12h
mocks-hpa                       Deployment/mocks                       41%/85%, 0%/50%   1         10        1          4d12h
user-pairing-service-hpa        Deployment/user-pairing-service        41%/85%, 0%/50%   1         10        1          4d12h
aaa-registration-service-hpa    Deployment/aaa-registration-service    57%/85%, 0%/50%   1         10        2          4d12h
webshop-purchase-service-hpa    Deployment/webshop-purchase-service    41%/85%, 0%/50%   1         10        1          4d12h
kubectl describe hpa -n dev
Name: xxx-registration-service-hpa
Namespace: dev
Labels: app.kubernetes.io/managed-by=Helm
Annotations: meta.helm.sh/release-name: vwg-registration-service
meta.helm.sh/release-namespace: dev
CreationTimestamp: Thu, 18 Jun 2020 22:50:27 +0200
Reference: Deployment/xxx-registration-service
Metrics: ( current / target )
resource memory on pods (as a percentage of request): 57% (303589376) / 85%
resource cpu on pods (as a percentage of request): 0% (1m) / 50%
Min replicas: 1
Max replicas: 10
Deployment pods: 2 current / 2 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True ReadyForNewScale recommended size matches current size
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events: <none>
If any further information is needed, please feel free to ask!
Thank you very much for your time!
Cheers,
Robin
The formula for determining the desired replica count is:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

The important part for your question is the ceil[...] function wrapper: it always rounds up to the next nearest replica. If currentReplicas is 2 and desiredMetricValue is 85%, then currentMetricValue must be 42.5% or lower to trigger a scale-down.

In your case, currentMetricValue is 57%, so you get:

desiredReplicas = ceil[2 * (57 / 85)]
                = ceil[2 * 0.671]
                = ceil[1.341]
                = 2

You are correct that the HPA would not see a need to scale up if currentReplicas were 1, either; actual utilization would need to climb above 85% to trigger it.
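The arithmetic above can be sketched with a small helper (a minimal illustration of the formula only; desired_replicas is a hypothetical name, not part of any Kubernetes API):

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """HPA formula: ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# The case from the question: 2 replicas at 57% memory, target 85%.
print(desired_replicas(2, 57, 85))    # stays at 2: ceil(1.341) == 2

# Scale-down threshold with 2 replicas and an 85% target: 42.5%.
print(desired_replicas(2, 42.5, 85))  # exactly 1
print(desired_replicas(2, 43, 85))    # just above the threshold: still 2
```

Running the boundary values through this makes it easy to see why the deployment-time memory spike matters: once a spike pushes the HPA to 2 replicas, steady-state usage must fall to half the target before it scales back down.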