Helm / kube-prometheus-stack: Can I create rules for exporters in values.yaml?
I want to be able to specify all of my rules for prometheus-blackbox-exporter, so I added them to rules-mine.yaml and deployed with:
helm upgrade --install -n monitoring blackbox -f values.yaml -f rules-mine.yaml .
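I can't see any of the rules listed at http://localhost:9090/rules, and nothing seems to be evaluated, as there are no alerts.... I need to do everything as IaC and deploy it through Terraform in an automated way.
- Is it possible to add rules to an exporter this way?
- If so, can anyone see what is wrong with the file below?
- If not, how can I add rules to multiple exporters efficiently?
The rules-mine.yaml file contains: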
prometheusRule:
  enabled: true
  namespace: monitoring
  additionalLabels:
    team: foxtrot_blackbox
    environment: production
    cluster: cluster
    namespace: namespace_x
  namespace: "monitoring"
  rules:
  - alert: BlackboxProbeFailed
    expr: probe_success == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Blackbox probe failed (instance {{`{{`}} $labels.instance {{`}}`}})
      description: "Probe failed\n VALUE = {{`{{`}} $value {{`}}`}}"
  - alert: BlackboxSlowProbe
    expr: avg_over_time(probe_duration_seconds[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: Blackbox slow probe (instance {{`{{`}} $labels.instance {{`}}`}})
      description: "Blackbox probe took more than 1s to complete\n VALUE = {{`{{`}} $value {{`}}`}}"
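Thanks for your help....
Are you sure you haven't made a typo in the label name "environment"? That will certainly not match what you expect unless you have actually labelled your sources.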
Best answer
The best approach I have found is to add the exporter rules to the kube-prometheus-stack values.yaml file (I actually created a separate rules.yaml file) and pass it to helm:
helm upgrade --install -n monitoring prometheus --create-namespace -f values-mine.yaml -f rules-mine.yaml prometheus-community/kube-prometheus-stack
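This picked up all of the rules as I wanted, so it seems like a decent solution. However, I would still prefer to keep the rules together with their exporter - if I find a solution I will post again.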
additionalPrometheusRulesMap:
  prometheus.rules:
    groups:
    - name: company.prometheus.rules
      rules:
      - alert: PrometheusNotificationsBacklog
        expr: min_over_time(prometheus_notifications_queue_length[10m]) > 0
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: Prometheus notifications backlog (instance {{ $labels.instance }})
          description: The Prometheus notification queue has not been empty for 10 minutes\nVALUE = {{ $value }}
          dashboard_url: ${grafana_url}/d/blackbox/blackbox-exporter?var-instance={{ $labels.instance }}
          runbook_url: ${wiki_url}/{{ $labels.alertname }}
  company.blackbox.rules:
    groups:
    - name: company.blackbox.rules
      rules:
      - alert: BlackboxProbeFailed
        expr: probe_success == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Blackbox probe failed (instance {{ $labels.instance }})
          description: Probe failed\nVALUE = {{ $value }}
          dashboard_url: ${grafana_url}/d/blackbox/blackbox-exporter?var-instance={{ $labels.instance }}
          runbook_url: ${wiki_url}/{{ $labels.alertname }}
      - alert: BlackboxSlowProbe
        expr: avg_over_time(probe_duration_seconds[1m]) > 1
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: Blackbox slow probe (instance {{ $labels.instance }})
          description: "Blackbox probe took more than 1s to complete\nVALUE = {{ $value }}"
          dashboard_url: ${grafana_url}/d/blackbox/blackbox-exporter?var-instance={{ $labels.instance }}
          runbook_url: ${wiki_url}/{{ $labels.alertname }}
# etc....
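The rules above use plain {{ $labels.instance }}, whereas the exporter values in the question escape the braces. As a rough illustration of why escaping can matter, the sketch below feeds both styles to Go's text/template, the engine Helm templates are built on. It assumes the chart in question passes rule values through the template engine; that is an assumption about the chart, not something confirmed in this thread. The helper name parseErr is made up for the example.

```go
package main

import (
	"fmt"
	"text/template"
)

// parseErr reports whether Go's text/template accepts src as a template.
func parseErr(src string) error {
	_, err := template.New("annotation").Parse(src)
	return err
}

func main() {
	// Unescaped, as written in the additionalPrometheusRulesMap values.
	plain := "Blackbox probe failed (instance {{ $labels.instance }})"
	// Escaped with a backtick raw string, so the braces pass through as text.
	escaped := "Blackbox probe failed for {{`{{$labels.instance}}`}}"

	for _, src := range []string{plain, escaped} {
		if err := parseErr(src); err != nil {
			// $labels is not a variable the template engine knows about.
			fmt.Println("rejected:", err)
		} else {
			fmt.Println("accepted:", src)
		}
	}
}
```

If a chart only serialises the values without templating them, the plain form reaches Prometheus untouched and no escaping is needed, which would be consistent with the unescaped rules above working.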
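A colleague found that this is entirely possible. The problem appears to have been the quoting used in the original implementation. The following is now in use and working, so I am posting it here in the hope that it is useful to others.
In summary:
{{`{{`}} $labels.instance {{`}}`}} == bad
{{`{{$labels.instance}}`}} == good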
prometheusRule:
  enabled: true
  additionalLabels:
    client: ${client_id}
    cluster: ${cluster}
    environment: ${environment}
    grafana: ${grafana_url}
  rules:
  - alert: BlackboxProbeFailed
    expr: probe_success == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: Blackbox probe failed for {{`{{$labels.instance}}`}}
      description: Probe failed VALUE = {{`{{$value}}`}}
      dashboard_url: https://${grafana_url}/d/blackbox/blackbox-exporter?var-instance={{`{{$labels.instance}}`}}
      runbook_url: ${wiki_url}/BlackboxProbeFailed
  - alert: BlackboxSlowProbe
    expr: avg_over_time(probe_duration_seconds[1m]) > 1
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: Blackbox slow probe for {{`{{$labels.instance}}`}}
      description: Blackbox probe took more than 1s to complete VALUE = {{`{{$value|humanizeDuration}}`}}
      dashboard_url: https://${grafana_url}/d/blackbox/blackbox-exporter?var-instance={{`{{$labels.instance}}`}}
      runbook_url: ${wiki_url}/BlackboxSlowProbe
Please ignore any missing variables etc.
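To see what the two escape styles from the summary produce, here is a minimal sketch that runs each one through a single pass of Go's text/template with no data. It assumes one render pass, as a Helm chart that templates its values would perform; it does not reproduce the blackbox exporter chart itself, and the helper name render is made up for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render runs one pass of Go text/template over src with no data.
func render(src string) (string, error) {
	t, err := template.New("rule").Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, nil); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	forms := []string{
		"{{`{{`}} $labels.instance {{`}}`}}", // braces escaped one pair at a time
		"{{`{{$labels.instance}}`}}",         // whole expression in one raw string
	}
	for _, f := range forms {
		out, err := render(f)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s  ->  %s\n", f, out)
	}
}
```

In this isolated test both forms come out as a Prometheus template expression and differ only in whitespace, so the sketch does not by itself explain why the first form failed in the original deployment. It is mainly a quick way to check an escape before putting it into a values file.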