kube-prometheus-stack 问题抓取指标

kube-prometheus-stack issue scraping metrics

一般集群信息:

我正在尝试让 kube-prometheus-stack helm chart 正常工作。这似乎对大多数目标都有效,但是,一些目标保持关闭状态,如下面的屏幕截图所示。

有什么建议吗,如何让 kube-etcdkube-controller-managerkube-scheduler 受到 Prometheus 的监控?

我如前所述 here and applied the suggestion here 部署了 helm chart 以获取 Prometheus 监控的 kube-proxy。

在此先感谢您的帮助!

编辑 1:

- job_name: monitoring/my-stack-kube-prometheus-s-kube-controller-manager/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: kube-prometheus-stack-kube-controller-manager
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: my-stack
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: 
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: 
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: 
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
- job_name: monitoring/my-stack-kube-prometheus-s-kube-etcd/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: kube-prometheus-stack-kube-etcd
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: my-stack
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: 
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: 
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: 
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
- job_name: monitoring/my-stack-kube-prometheus-s-kube-scheduler/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: kube-prometheus-stack-kube-scheduler
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_service_label_release]
    separator: ;
    regex: my-stack
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: 
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: 
    action: replace
  - source_labels: [__meta_kubernetes_service_label_jobLabel]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: 
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: 
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: 
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system

这是因为 Prometheus 正在监视这些目标的错误端点 and/or 目标不公开指标端点。

controller-manager为例:

  1. 更改绑定地址(默认值:127.0.0.1):
$ sudo vi /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
  ...
spec:
  containers:
  - command:
    - kube-controller-manager
    ...
    - --bind-address=<your control-plane IP or 0.0.0.0>
    ...

如果您使用的是控制平面 IP,则还需要更改 livenessProbestartupProbe 主机。

  1. 更改端点(服务)端口(默认值:10252):
$ kubectl edit service prometheus-kube-prometheus-kube-controller-manager -n kube-system
apiVersion: v1
kind: Service
...
spec:
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10257
    protocol: TCP
    targetPort: 10257
 ...
  1. 更改服务监控方案(默认:http):
$ kubectl edit servicemonitor prometheus-kube-prometheus-kube-controller-manager -n prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
...
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    port: http-metrics
    scheme: https
    tlsConfig:
      caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecureSkipVerify: true
  jobLabel: jobLabel
  ...

您可以更新 Kubeadm 配置文件来更改绑定地址

scheduler:
  extraArgs:
    address: 0.0.0.0
    bind-address: 0.0.0.0

...

controllerManager:
  extraArgs:
    address: 0.0.0.0
    bind-address: 0.0.0.0