Error scaling up in HPA in GKE: apiserver was unable to write a JSON response: http2: stream closed

I followed Google's guide for deploying an HPA in Google Kubernetes Engine: https://cloud.google.com/kubernetes-engine/docs/tutorials/autoscaling-metrics

I added the required permissions, since I am using Workload Identity, following this guide: https://github.com/GoogleCloudPlatform/k8s-stackdriver/tree/master/custom-metrics-stackdriver-adapter

I also added the firewall rule described in this comment: https://github.com/kubernetes-sigs/prometheus-adapter/issues/134

I am stuck at the point where the HPA returns this error:

kubectl describe hpa -n test-namespace
Name:                  my-hpa
Namespace:             test-namespace
Labels:                <none>
Annotations:           <none>
CreationTimestamp:     Tue, 13 Apr 2021 12:47:56 +0200
Reference:             StatefulSet/my-set
Metrics:               ( current / target )
  "my-metric" on pods:  <unknown> / 1
Min replicas:          1
Max replicas:          60
StatefulSet pods:      1 current / 0 desired
Conditions:
  Type           Status  Reason               Message
  ----           ------  ------               -------
  AbleToScale    True    SucceededGetScale    the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetPodsMetric  the HPA was unable to compute the replica count: unable to get metric my-metric: no metrics returned from custom metrics API
Events:
  Type     Reason                        Age                   From                       Message
  ----     ------                        ----                  ----                       -------
  Warning  FailedGetPodsMetric           8m26s (x40 over 18m)  horizontal-pod-autoscaler  unable to get metric my-metric: no metrics returned from custom metrics API
  Warning  FailedComputeMetricsReplicas  3m26s (x53 over 18m)  horizontal-pod-autoscaler  failed to compute desired number of replicas based on listed metrics for StatefulSet/test-namespace/my-set: invalid metrics (1 invalid out of 1), first error is: failed to get pods metric value: unable to get metric my-metric: no metrics returned from custom metrics API
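
For context on what "compute the replica count" means here, the following is a minimal sketch of the formula given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue), with the default 10% tolerance and the min/max bounds from the output above. The function name and its parameters are illustrative, and the real controller also handles unready pods, missing metrics, and scale-down stabilization, which are left out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=60, tolerance=0.1):
    """Simplified HPA replica calculation (illustrative, not the controller code)."""
    ratio = current_value / target_value
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the HPA's minReplicas / maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# One replica reporting an average of 5 against a target of 1.
print(desired_replicas(1, 5, 1))
```

Until the metric can be read, there is no current value to feed into this calculation, which is why ScalingActive stays False.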

But the APIServices report Available as True:

kubectl get apiservices
NAME                                     SERVICE                                             AVAILABLE   AGE
...
v1beta1.custom.metrics.k8s.io            custom-metrics/custom-metrics-stackdriver-adapter   True        24h
v1beta1.external.metrics.k8s.io          custom-metrics/custom-metrics-stackdriver-adapter   True        24h
v1beta2.custom.metrics.k8s.io            custom-metrics/custom-metrics-stackdriver-adapter   True        24h
...

When I query the metric data directly, it is returned correctly:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta2/namespaces/test-namespace/pods/*/my-metric" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta2",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta2/namespaces/test-namespace/pods/%2A/my-metric"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "test-namespace",
        "name": "my-metrics-api-XXXX",
        "apiVersion": "/__internal"
      },
      "metric": {
        "name": "my-metric",
        "selector": null
      },
      "timestamp": "2021-04-13T11:15:30Z",
      "value": "5"
    }
  ]
}
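
A small sketch for inspecting a response like the one above: it lists which pod each value is attached to, so the names can be compared against the pods of the HPA's scale target. This is only a debugging aid under my own assumptions, not a confirmed diagnosis. The helper names are hypothetical, the sample is a trimmed copy of the output above, and the quantity parser handles only plain numbers and the milli suffix.

```python
import json


def parse_quantity(quantity):
    """Parse a Kubernetes quantity string; only plain numbers and "m" (milli) are handled."""
    if quantity.endswith("m"):
        return float(quantity[:-1]) / 1000.0
    return float(quantity)


def pod_values(metric_value_list):
    """Map each described pod name to its metric value."""
    doc = json.loads(metric_value_list)
    return {
        item["describedObject"]["name"]: parse_quantity(item["value"])
        for item in doc["items"]
    }


# Trimmed copy of the MetricValueList shown above.
sample = """
{
  "kind": "MetricValueList",
  "items": [
    {
      "describedObject": {"kind": "Pod", "namespace": "test-namespace",
                          "name": "my-metrics-api-XXXX"},
      "metric": {"name": "my-metric", "selector": null},
      "value": "5"
    }
  ]
}
"""

for pod, value in pod_values(sample).items():
    print(pod, value)
```

The output of kubectl get --raw can be saved to a file and passed to pod_values in place of the sample string.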

But Stackdriver shows me this error:

2021-04-13T11:01:30.432634Z apiserver was unable to write a JSON response: http2: stream closed
2021-04-13T11:01:30.432679Z apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http2: stream closed"}

I had to configure the adapter provided by Google like this:

apiVersion: v1
kind: Namespace
metadata:
  name: custom-metrics
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: custom-metrics-stackdriver-adapter
  namespace: custom-metrics
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: custom-metrics-stackdriver-adapter
  namespace: custom-metrics
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: custom-metrics-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: custom-metrics-stackdriver-adapter
  namespace: custom-metrics
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: custom-metrics-resource-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
- kind: ServiceAccount
  name: custom-metrics-stackdriver-adapter
  namespace: custom-metrics
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-metrics-stackdriver-adapter
  namespace: custom-metrics
  labels:
    run: custom-metrics-stackdriver-adapter
    k8s-app: custom-metrics-stackdriver-adapter
spec:
  replicas: 1
  selector:
    matchLabels:
      run: custom-metrics-stackdriver-adapter
      k8s-app: custom-metrics-stackdriver-adapter
  template:
    metadata:
      labels:
        run: custom-metrics-stackdriver-adapter
        k8s-app: custom-metrics-stackdriver-adapter
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: custom-metrics-stackdriver-adapter
      containers:
      - image: gcr.io/gke-release/custom-metrics-stackdriver-adapter:v0.12.0-gke.0
        imagePullPolicy: Always
        name: pod-custom-metrics-stackdriver-adapter
        command:
        - /adapter
        - --use-new-resource-model=true
        - --cert-dir=/tmp
        - --secure-port=4443
        resources:
          limits:
            cpu: 250m
            memory: 200Mi
          requests:
            cpu: 250m
            memory: 200Mi
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: custom-metrics-stackdriver-adapter
    k8s-app: custom-metrics-stackdriver-adapter
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Adapter
  name: custom-metrics-stackdriver-adapter
  namespace: custom-metrics
spec:
  ports:
  - port: 443
    protocol: TCP
    targetPort: 4443
  selector:
    run: custom-metrics-stackdriver-adapter
    k8s-app: custom-metrics-stackdriver-adapter
  type: ClusterIP
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  insecureSkipTLSVerify: true
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  versionPriority: 100
  service:
    name: custom-metrics-stackdriver-adapter
    namespace: custom-metrics
  version: v1beta1
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta2.custom.metrics.k8s.io
spec:
  insecureSkipTLSVerify: true
  group: custom.metrics.k8s.io
  groupPriorityMinimum: 100
  versionPriority: 200
  service:
    name: custom-metrics-stackdriver-adapter
    namespace: custom-metrics
  version: v1beta2
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  insecureSkipTLSVerify: true
  group: external.metrics.k8s.io
  groupPriorityMinimum: 100
  versionPriority: 100
  service:
    name: custom-metrics-stackdriver-adapter
    namespace: custom-metrics
  version: v1beta1
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-metrics-reader
rules:
- apiGroups:
  - "external.metrics.k8s.io"
  resources:
  - "*"
  verbs:
  - list
  - get
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: external-metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-metrics-reader
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system

Because port 443 is disabled, I had to change the secure port to 4443, and I added the --cert-dir=/tmp option because without it Stackdriver showed me this error:

"unable to run custom metrics adapter: error creating self-signed certificates: mkdir apiserver.local.config: permission denied"

I think I have explained every step I took to configure it, but it still does not work. Any ideas?

Solved it myself!

After many tests, I modified the HPA YAML.

Changing the metric type from Pods to External, and using the metric name custom.googleapis.com|my-metric, works!

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
  namespace: test-namespace
spec:
  maxReplicas: 60
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: my-set
  metrics:
  - type: External
    external:
      metric: 
        name: custom.googleapis.com|my-metric
      target:
        averageValue: 1
        type: AverageValue
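
The metric name in the manifest above uses "|" where the Cloud Monitoring metric type uses "/", which is the convention the Stackdriver adapter expects for External metrics. A minimal sketch of that conversion, with a hypothetical helper name:

```python
def external_metric_name(metric_type):
    """Turn a Cloud Monitoring metric type into the name used in an HPA External metric."""
    # The adapter expects every "/" in the metric type to be written as "|".
    return metric_type.replace("/", "|")


print(external_metric_name("custom.googleapis.com/my-metric"))
# prints: custom.googleapis.com|my-metric
```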