Grafana Dashboard setup for Prometheus Federation
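I am using Prometheus federation to scrape metrics from multiple k8s clusters. It works fine, and I would like to create some dashboards in Grafana that can be filtered by tenant (cluster). I am trying to use variables, but there is something I don't understand: even though I did not configure anything special for kube_pod_container_status_restarts_total, it carries the label I specified under static_configs below, whereas kube_node_spec_unschedulable does not.
So where does this difference come from, and what should I do about it? Also, what is the best-practice way to set up dashboards that can be filtered by multiple cluster names? Should I use relabeling?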
kube_pod_container_status_restarts_total{app="kube-state-metrics",container="backup",....,tenant="022"}
kube_node_spec_unschedulable{app="kube-state-metrics",....kubernetes_pod_name="kube-state-metrics-7d54b595f-r6m9k",node="022-kube-master01",pod_template_hash="7d54b595f"
Prometheus server
prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts
  scrape_configs:
    - job_name: prometheus
      static_configs:
        - targets:
            - localhost:9090
Central cluster
scrape_configs:
  - job_name: federation_012
    scrape_interval: 5m
    scrape_timeout: 1m
    honor_labels: true
    honor_timestamps: true
    metrics_path: /prometheus/federate
    params:
      'match[]':
        - '{job!=""}'
    scheme: https
    static_configs:
      - targets:
          - host
        labels:
          tenant: 012
    tls_config:
      insecure_skip_verify: true
  - job_name: federation_022
    scrape_interval: 5m
    scrape_timeout: 1m
    honor_labels: true
    honor_timestamps: true
    metrics_path: /prometheus/federate
    params:
      'match[]':
        - '{job!=""}'
    scheme: https
    static_configs:
      - targets:
          - host
        labels:
          tenant: 022
    tls_config:
      insecure_skip_verify: true
Central Prometheus server
scrape_configs:
  - job_name: federate
    scrape_interval: 5m
    scrape_timeout: 1m
    honor_labels: true
    honor_timestamps: true
    metrics_path: /prometheus/federate
    params:
      'match[]':
        - '{job!=""}'
    scheme: https
    static_configs:
      - targets:
          - source_host_012
          - source_host_022
    tls_config:
      insecure_skip_verify: true
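If the single federate job should also tag each source, one option is to split static_configs into one entry per target and attach a tenant label to each. This is only a sketch: the tenant values are assumed to match the IDs in the host names, and because honor_labels is true, a tenant label already present on a federated series takes precedence over the one set here.

```yaml
scrape_configs:
  - job_name: federate
    # scrape_interval, scrape_timeout, tls_config etc. as in the job above
    honor_labels: true
    metrics_path: /prometheus/federate
    params:
      'match[]':
        - '{job!=""}'
    scheme: https
    static_configs:
      # One entry per source so that each target gets its own tenant label.
      - targets:
          - source_host_012
        labels:
          tenant: '012'   # assumed value; quoted so YAML keeps it a string
      - targets:
          - source_host_022
        labels:
          tenant: '022'   # assumed value
```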
Source Prometheus (tenant 012)
prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts
  scrape_configs:
    - job_name: tenant_012
      static_configs:
        - targets:
            - localhost:9090
          labels:
            tenant: 012
Source Prometheus (tenant 022)
prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts
  scrape_configs:
    - job_name: tenant_022
      static_configs:
        - targets:
            - localhost:9090
          labels:
            tenant: 022
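Another way to make every federated series carry the tenant, sketched here as an assumption rather than something taken from the configs above, is to set it once as an external label on each source Prometheus. As far as I know, the /federate endpoint attaches external labels to the series it exposes when they are not already set, so with honor_labels: true on the central server each series should arrive with its tenant.

```yaml
# Source Prometheus for tenant 012 (the tenant 022 instance would use '022').
global:
  external_labels:
    tenant: '012'   # assumed value; quoted so YAML keeps it a string
```

With a consistent tenant label in place, a Grafana dashboard variable defined with a query such as label_values(up, tenant) could then drive panel filters like {tenant=~"$tenant"}.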
If you still don't get the labels you need, try adding relabel_configs to your federate job and distinguishing metrics by the source job name:
relabel_configs:
  - source_labels: [job]
    target_label: tenant
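One caveat worth checking against the relabeling docs: relabel_configs runs on the target before the scrape, when job still holds the scrape config's own job_name (federate here), so the rule above would give every target the same tenant value. A sketch that reads the job label carried by the federated series themselves uses metric_relabel_configs instead. The tenant_(.+) pattern is an assumption based on the tenant_012 and tenant_022 job names above; series from other source jobs would not match and are left unchanged.

```yaml
metric_relabel_configs:
  # Runs on each scraped series; with honor_labels: true, job is the source job name.
  - source_labels: [job]
    regex: 'tenant_(.+)'   # assumed naming scheme, e.g. tenant_012
    target_label: tenant
    replacement: '$1'      # tenant_012 -> tenant="012"
```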
Or extract distinctive information from the __address__ label (or any other label with the __ prefix):
relabel_configs:
  - source_labels: [__address__]
    target_label: tenant_host
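Copying __address__ whole yields the full host name as the label value. If only the tenant ID is wanted, a capture group can extract it. This sketch assumes the targets are named like source_host_012, optionally followed by a port; addresses that do not match the pattern leave tenant untouched.

```yaml
relabel_configs:
  - source_labels: [__address__]
    regex: 'source_host_(\d+)(:\d+)?'   # assumed host naming, taken from the targets above
    target_label: tenant
    replacement: '$1'                    # source_host_012 -> tenant="012"
```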
PS: Keep in mind that labels starting with __ are removed from the label set once target relabeling has completed.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config