Prometheus getting 403 forbidden from kubernetes api in GKE

For the ClusterRole of my Prometheus deployment, I have:

# ClusterRole for the deployment
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]

The ServiceAccount and ClusterRoleBinding are also in place.

The following are the settings for the jobs in prometheus.yml that are getting the 403 error:

    - job_name: 'kubernetes-cadvisor'

      scheme: https

      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

    - job_name: 'kubernetes-nodes'

      scheme: https

      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

      kubernetes_sd_configs:
      - role: node

      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics

I don't understand why I'm still getting the 403 error even though the ServiceAccount and the ClusterRole have been bound together.

Make sure that the /var/run/secrets/kubernetes.io/serviceaccount/token file contains the correct token. To check, you can enter the Prometheus pod with:

kubectl exec -it -n <namespace> <Prometheus_pod_name> -- bash

and cat the token file. Then exit the pod and run:

echo $(kubectl get secret -n <namespace> <prometheus_serviceaccount_secret> -o jsonpath='{.data.token}') | base64 --decode

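If the tokens match, you can try querying the Kubernetes API server with Postman or Insomnia to see whether the rules you put in your ClusterRole are correct. I suggest you query both the /proxy/metrics/cadvisor and /proxy/metrics URLs.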
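
If Postman or Insomnia is not at hand, the same check could be done with curl from inside the Prometheus pod. The following is only a rough sketch: it assumes curl is available in the container, the helper names node_proxy_path and probe are made up for illustration, and NODE stands for a real node name taken from kubectl get nodes.

```shell
# Sketch: replay the Prometheus scrape request by hand, from inside the pod.
# Assumptions: curl exists in the container; helper names are hypothetical.

SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount
API_SERVER=https://kubernetes.default.svc:443

# Build the API server proxy path for a node, mirroring the relabel_configs
# in the question. $1 = node name, $2 = optional suffix such as "cadvisor".
node_proxy_path() {
  printf '/api/v1/nodes/%s/proxy/metrics%s\n' "$1" "${2:+/$2}"
}

# Print only the HTTP status code that the pod's service account token
# receives for the given path ($1).
probe() {
  curl -sS -o /dev/null -w '%{http_code}\n' \
    --cacert "$SA_DIR/ca.crt" \
    -H "Authorization: Bearer $(cat "$SA_DIR/token")" \
    "$API_SERVER$1"
}

# Usage inside the pod, with NODE set to a real node name:
#   probe "$(node_proxy_path "$NODE")"
#   probe "$(node_proxy_path "$NODE" cadvisor)"
#
# From outside the pod, RBAC can also be checked without sending a scrape
# request (placeholders are to be filled in, as in the commands above):
#   kubectl auth can-i get nodes --subresource=proxy \
#     --as=system:serviceaccount:<namespace>:<serviceaccount_name>
```

A 200 would suggest the ClusterRole rules are sufficient for that path, a 403 points to an RBAC denial for the service account, and a 401 would mean the token itself is not being accepted.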