kube-prometheus-stack 问题抓取指标
kube-prometheus-stack issue scraping metrics
一般集群信息:
- Kubernetes 版本:1.19.13
- 正在使用的云:私有
- 安装方式:kubeadm init
- 主机OS:Ubuntu 20.04.1 LTS
- CNI及版本:织网:2.7.0
- CRI 和版本:Docker:19.3.13
我正在尝试让 kube-prometheus-stack
helm chart 正常工作。这似乎对大多数目标都有效,但是,一些目标保持关闭状态,如下面的屏幕截图所示。
有什么建议吗,如何让 kube-etcd
、kube-controller-manager
和 kube-scheduler
受到 Prometheus
的监控?
我如前所述 here and applied the suggestion here 部署了 helm chart 以获取 Prometheus
监控的 kube-proxy。
在此先感谢您的帮助!
编辑 1:
- job_name: monitoring/my-stack-kube-prometheus-s-kube-controller-manager/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_app]
separator: ;
regex: kube-prometheus-stack-kube-controller-manager
replacement:
action: keep
- source_labels: [__meta_kubernetes_service_label_release]
separator: ;
regex: my-stack
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: http-metrics
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Node;(.*)
target_label: node
replacement:
action: replace
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Pod;(.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: service
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: job
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_label_jobLabel]
separator: ;
regex: (.+)
target_label: job
replacement:
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http-metrics
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement:
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement:
action: keep
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
- job_name: monitoring/my-stack-kube-prometheus-s-kube-etcd/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_app]
separator: ;
regex: kube-prometheus-stack-kube-etcd
replacement:
action: keep
- source_labels: [__meta_kubernetes_service_label_release]
separator: ;
regex: my-stack
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: http-metrics
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Node;(.*)
target_label: node
replacement:
action: replace
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Pod;(.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: service
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: job
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_label_jobLabel]
separator: ;
regex: (.+)
target_label: job
replacement:
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http-metrics
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement:
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement:
action: keep
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
- job_name: monitoring/my-stack-kube-prometheus-s-kube-scheduler/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_app]
separator: ;
regex: kube-prometheus-stack-kube-scheduler
replacement:
action: keep
- source_labels: [__meta_kubernetes_service_label_release]
separator: ;
regex: my-stack
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: http-metrics
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Node;(.*)
target_label: node
replacement:
action: replace
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Pod;(.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: service
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: job
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_label_jobLabel]
separator: ;
regex: (.+)
target_label: job
replacement:
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http-metrics
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement:
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement:
action: keep
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
这是因为 Prometheus
正在监视这些目标的错误端点 and/or 目标不公开指标端点。
以controller-manager
为例:
- 更改绑定地址(默认值:127.0.0.1):
$ sudo vi /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
...
spec:
containers:
- command:
- kube-controller-manager
...
- --bind-address=<your control-plane IP or 0.0.0.0>
...
如果您使用的是控制平面 IP,则还需要更改 livenessProbe
和 startupProbe
主机。
- 更改端点(服务)端口(默认值:10252):
$ kubectl edit service prometheus-kube-prometheus-kube-controller-manager -n kube-system
apiVersion: v1
kind: Service
...
spec:
clusterIP: None
ports:
- name: http-metrics
port: 10257
protocol: TCP
targetPort: 10257
...
- 更改服务监控方案(默认:http):
$ kubectl edit servicemonitor prometheus-kube-prometheus-kube-controller-manager -n prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
...
spec:
endpoints:
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
port: http-metrics
scheme: https
tlsConfig:
caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecureSkipVerify: true
jobLabel: jobLabel
...
您可以更新 Kubeadm 配置文件来更改绑定地址
scheduler:
extraArgs:
address: 0.0.0.0
bind-address: 0.0.0.0
...
controllerManager:
extraArgs:
address: 0.0.0.0
bind-address: 0.0.0.0
一般集群信息:
- Kubernetes 版本:1.19.13
- 正在使用的云:私有
- 安装方式:kubeadm init
- 主机OS:Ubuntu 20.04.1 LTS
- CNI及版本:织网:2.7.0
- CRI 和版本:Docker:19.3.13
我正在尝试让 kube-prometheus-stack
helm chart 正常工作。这似乎对大多数目标都有效,但是,一些目标保持关闭状态,如下面的屏幕截图所示。
有什么建议吗,如何让 kube-etcd
、kube-controller-manager
和 kube-scheduler
受到 Prometheus
的监控?
我如前所述 here and applied the suggestion here 部署了 helm chart 以获取 Prometheus
监控的 kube-proxy。
在此先感谢您的帮助!
编辑 1:
- job_name: monitoring/my-stack-kube-prometheus-s-kube-controller-manager/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_app]
separator: ;
regex: kube-prometheus-stack-kube-controller-manager
replacement:
action: keep
- source_labels: [__meta_kubernetes_service_label_release]
separator: ;
regex: my-stack
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: http-metrics
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Node;(.*)
target_label: node
replacement:
action: replace
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Pod;(.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: service
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: job
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_label_jobLabel]
separator: ;
regex: (.+)
target_label: job
replacement:
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http-metrics
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement:
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement:
action: keep
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
- job_name: monitoring/my-stack-kube-prometheus-s-kube-etcd/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_app]
separator: ;
regex: kube-prometheus-stack-kube-etcd
replacement:
action: keep
- source_labels: [__meta_kubernetes_service_label_release]
separator: ;
regex: my-stack
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: http-metrics
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Node;(.*)
target_label: node
replacement:
action: replace
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Pod;(.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: service
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: job
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_label_jobLabel]
separator: ;
regex: (.+)
target_label: job
replacement:
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http-metrics
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement:
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement:
action: keep
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
- job_name: monitoring/my-stack-kube-prometheus-s-kube-scheduler/0
honor_timestamps: true
scrape_interval: 30s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_service_label_app]
separator: ;
regex: kube-prometheus-stack-kube-scheduler
replacement:
action: keep
- source_labels: [__meta_kubernetes_service_label_release]
separator: ;
regex: my-stack
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_port_name]
separator: ;
regex: http-metrics
replacement:
action: keep
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Node;(.*)
target_label: node
replacement:
action: replace
- source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
separator: ;
regex: Pod;(.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
target_label: namespace
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: service
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
target_label: pod
replacement:
action: replace
- source_labels: [__meta_kubernetes_pod_container_name]
separator: ;
regex: (.*)
target_label: container
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_name]
separator: ;
regex: (.*)
target_label: job
replacement:
action: replace
- source_labels: [__meta_kubernetes_service_label_jobLabel]
separator: ;
regex: (.+)
target_label: job
replacement:
action: replace
- separator: ;
regex: (.*)
target_label: endpoint
replacement: http-metrics
action: replace
- source_labels: [__address__]
separator: ;
regex: (.*)
modulus: 1
target_label: __tmp_hash
replacement:
action: hashmod
- source_labels: [__tmp_hash]
separator: ;
regex: "0"
replacement:
action: keep
kubernetes_sd_configs:
- role: endpoints
namespaces:
names:
- kube-system
这是因为 Prometheus
正在监视这些目标的错误端点 and/or 目标不公开指标端点。
以controller-manager
为例:
- 更改绑定地址(默认值:127.0.0.1):
$ sudo vi /etc/kubernetes/manifests/kube-controller-manager.yaml
apiVersion: v1
kind: Pod
metadata:
...
spec:
containers:
- command:
- kube-controller-manager
...
- --bind-address=<your control-plane IP or 0.0.0.0>
...
如果您使用的是控制平面 IP,则还需要更改 livenessProbe
和 startupProbe
主机。
- 更改端点(服务)端口(默认值:10252):
$ kubectl edit service prometheus-kube-prometheus-kube-controller-manager -n kube-system
apiVersion: v1
kind: Service
...
spec:
clusterIP: None
ports:
- name: http-metrics
port: 10257
protocol: TCP
targetPort: 10257
...
- 更改服务监控方案(默认:http):
$ kubectl edit servicemonitor prometheus-kube-prometheus-kube-controller-manager -n prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
...
spec:
endpoints:
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
port: http-metrics
scheme: https
tlsConfig:
caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecureSkipVerify: true
jobLabel: jobLabel
...
您可以更新 Kubeadm 配置文件来更改绑定地址
scheduler:
extraArgs:
address: 0.0.0.0
bind-address: 0.0.0.0
...
controllerManager:
extraArgs:
address: 0.0.0.0
bind-address: 0.0.0.0