Why does my GKE node pool not auto-scale down?

I have a preemptible node pool that is clearly underutilized.

The node pool runs a Deployment managed by an HPA, set up as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      initContainers:
      - name: wait-for-database
        image: ### IMAGE ###
        command: ['bash', 'init.sh']
      containers:
      - name: backend
        image: ### IMAGE ###
        command: ["bash", "entrypoint.sh"]
        imagePullPolicy: Always
        resources:
          requests:
            memory: "200M"
            cpu: "50m"
        ports:
        - name: probe-port
          containerPort: 8080
          hostPort: 8080
        volumeMounts:
          - name: static-shared-data
            mountPath: /static
        readinessProbe:
          httpGet:
            path: /readiness/
            port: probe-port
          failureThreshold: 5
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
      - name: nginx
        image: nginx:alpine
        resources:
          requests:
            memory: "400M"
            cpu: "20m"
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-proxy-config
          mountPath: /etc/nginx/conf.d/default.conf
          subPath: app.conf
        - name: static-shared-data
          mountPath: /static
      volumes:
      - name: nginx-proxy-config
        configMap:
          name: backend-nginx
      - name: static-shared-data
        emptyDir: {}
      nodeSelector:
        cloud.google.com/gke-nodepool: app-dev
      tolerations:
      - effect: NoSchedule
        key: workload
        operator: Equal
        value: dev
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: backend
  namespace: default
spec:
  maxReplicas: 12
  minReplicas: 8
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: backend
  metrics:
  - resource:
      name: cpu
      targetAverageUtilization: 50
    type: Resource
---

The node pool also carries the matching taint for this toleration.
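For reference, the taint can be confirmed on any node in the pool. The node name is taken from the listing further below, and the expected key/value is inferred from the Deployment's toleration:

kubectl describe node gke-dev-app-dev-fee1a901-fvw9 | grep Taints
# Expected, matching the toleration above:
# Taints:  workload=dev:NoSchedule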

The HPA utilization (kubectl get hpa) shows:

NAME              REFERENCE                    TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
backend-develop   Deployment/backend-develop   10%/50%   8         12        8          38d

But the node pool has not scaled down for a day or so, even though this deployment carries no heavy load:

NAME                             STATUS   ROLES    AGE     VERSION
gke-dev-app-dev-fee1a901-fvw9    Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-gls7    Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-lf3f    Ready    <none>   24h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-lgw9    Ready    <none>   3d10h   v1.14.10-gke.36
gke-dev-app-dev-fee1a901-qxkz    Ready    <none>   3h35m   v1.14.10-gke.36
gke-dev-app-dev-fee1a901-s10l    Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-sj4d    Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-vdnw    Ready    <none>   27h     v1.14.10-gke.36

Neither the deployment nor the node pool has any affinity settings. Some nodes easily pack several identical pods, but other nodes hold a single pod for hours and never get scaled down; the spread can be inspected as shown below.
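A quick way to see the pod-to-node spread (the label selector is taken from the Deployment above):

kubectl get pods -l app=backend -o wide
# The NODE column shows which node each replica landed on.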

What could be wrong?

The problem is:

hostPort: 8080

A hostPort binds the pod to that port on the node itself, so at most one such pod can run per node; with minReplicas: 8, each of the 8 replicas claims port 8080 on its own node, which matches the 8 nodes listed above. Whenever the cluster autoscaler tries to drain an underutilized node, the evicted pod cannot be rescheduled onto any node that already runs a replica, and scheduling fails with FailedScheduling ... didn't have free ports. That is why the nodes stay online.
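The failed attempts are visible in the events:

kubectl get events --field-selector reason=FailedScheduling

A minimal sketch of the fix, assuming the port only needs to be reachable inside the cluster: drop hostPort from the container spec (keep the named containerPort) and expose the pods through a Service instead. The Service below is hypothetical; its name, selector, and ports are taken from the Deployment above.

apiVersion: v1
kind: Service
metadata:
  name: backend            # hypothetical Service name
spec:
  selector:
    app: backend           # matches the Deployment's pod labels
  ports:
  - name: http
    port: 8080             # port the Service listens on
    targetPort: probe-port # resolves to containerPort 8080 in the pod

Without the host port, multiple replicas can share a node, the autoscaler can bin-pack them onto fewer nodes, and the surplus nodes become removable.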