Kubernetes horizontal pod autoscaler not creating replicas according to replica count

I am trying to deploy a dockerized web service with a Helm chart on a custom Kubernetes cluster (created with kubeadm). When it autoscales, it does not create replicas according to the replica count.

Here is my deployment file:

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: {{ template "demochart.fullname" . }}
  labels:
    app: {{ template "demochart.name" . }}
    chart: {{ template "demochart.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ template "demochart.name" . }}
      release: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ template "demochart.name" . }}
        release: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: 80
          volumeMounts:
            - name: cred-storage
              mountPath: /root/
          resources:
{{ toYaml .Values.resources | indent 12 }}
    {{- with .Values.nodeSelector }}
      nodeSelector:
{{ toYaml . | indent 8 }}
    {{- end }}
    {{- with .Values.affinity }}
      affinity:
{{ toYaml . | indent 8 }}
    {{- end }}
    {{- with .Values.tolerations }}
      tolerations:
{{ toYaml . | indent 8 }}
    {{- end }}
      volumes:
        - name: cred-storage
          hostPath:
            path: /home/aodev/
            type:

Here is the values.yaml:

replicaCount: 3

image:
  repository: REPO_NAME
  tag: latest
  pullPolicy: IfNotPresent

service:
  type: NodePort
  port: 8007

ingress:
  enabled: false
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  path: /
  hosts:
    - chart-example.local
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local

resources: 
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  limits:
    cpu: 1000m
    memory: 2000Mi
  requests:
    cpu: 1000m
    memory: 2000Mi

nodeSelector: {}

tolerations: []

affinity: {}
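
Note that, per the HPA docs quoted further down, utilization is measured against the pod's CPU request. With the values above (a reading of this config, not output from the cluster), the 50% target corresponds to roughly 500m of actual usage per pod:

resources:
  requests:
    cpu: 1000m   # utilization = usage / request, so a 50% target ≈ 500m per pod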

Here are my running pods, which include heapster and metrics-server as well as my web service.

kubectl get pods before autoscaling

Below is the HPA file:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
  name: entitydetection
  namespace: kube-system
spec:
  maxReplicas: 20
  minReplicas: 5
  scaleTargetRef:
    apiVersion: apps/v1beta2
    kind: Deployment
    name: entitydetection
  targetCPUUtilizationPercentage: 50
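
You can check what utilization the autoscaler actually observed with the standard kubectl commands (the TARGETS column of get hpa shows current vs. target utilization):

kubectl get hpa entitydetection -n kube-system
kubectl describe hpa entitydetection -n kube-system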

So I set replicas to 3 in the deployment, and in the HPA I set minReplicas to 5, maxReplicas to 20, and targetCPUUtilizationPercentage to 50. When CPU utilization exceeds 50%, it seems to create a random number of replicas rather than following the replica count.

So when CPU exceeded 50%, the 2 replicas below were created (age 36s). Ideally it should have created 3 replicas. What is the problem?

kubectl get pods after autoscaling

Here is a quote from the HPA design documentation:

The autoscaler is implemented as a control loop. It periodically queries pods described by Status.PodSelector of Scale subresource, and collects their CPU utilization.

Then, it compares the arithmetic mean of the pods' CPU utilization with the target defined in Spec.CPUUtilization, and adjusts the replicas of the Scale if needed to match the target (preserving condition: MinReplicas <= Replicas <= MaxReplicas).

CPU utilization is the recent CPU usage of a pod (average across the last 1 minute) divided by the CPU requested by the pod.

The target number of pods is calculated from the following formula:

TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target)    

Starting and stopping pods may introduce noise to the metric (for instance, starting may temporarily increase CPU). So, after each action, the autoscaler should wait some time for reliable data. Scale-up can only happen if there was no rescaling within the last 3 minutes. Scale-down will wait for 5 minutes from the last rescaling.
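
Applying that formula here (the real utilization figures are not in the question, so the numbers below are assumed purely for illustration): if the 5 existing pods were averaging about 70% CPU each, then

sum(CurrentPodsCPUUtilization) = 5 * 70 = 350
TargetNumOfPods = ceil(350 / 50) = 7

i.e. 2 pods are added on top of the existing 5, which matches the screenshot; replicaCount plays no part in the calculation.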

So the HPA creates the minimum number of pods that can handle the current load; the number it adds comes from the formula above, not from the deployment's replica count.