Is it possible to avoid sending repeated Slack notifications for an already-fired alert?
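Disclaimer: this is my first time using Prometheus.
I am trying to send a Slack notification every time a Job ends successfully.
To achieve this, I installed kube-state-metrics, Prometheus and AlertManager.
Then I created the following rule: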
rules:
- alert: KubeJobCompleted
  annotations:
    identifier: '{{ $labels.instance }}'
    summary: Job Completed Successfully
    description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
  expr: |
    kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"} == 0
  labels:
    severity: information
And added the AlertManager receiver text (template):
{{ define "custom_slack_message" }}
{{ range .Alerts }}
{{ .Annotations.description }}
{{ end }}
{{ end }}
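For context, a minimal sketch of how a template like this is usually referenced from the AlertManager configuration. The receiver name, channel, webhook URL, template path, and all timing values below are assumptions for illustration and are not taken from the original setup.

```yaml
# alertmanager.yml (sketch; every name and value here is assumed)
global:
  slack_api_url: 'https://hooks.slack.com/services/XXX'   # placeholder webhook URL
templates:
  - '/etc/alertmanager/templates/*.tmpl'   # hypothetical path to the file defining custom_slack_message
route:
  receiver: slack-jobs                     # hypothetical receiver name
  group_by: ['alertname']                  # all KubeJobCompleted alerts land in one group
  group_wait: 30s                          # wait before the first notification for a new group
  group_interval: 5m                       # wait before notifying about alerts newly added to the group
  repeat_interval: 4h                      # wait before re-sending a notification that was already sent
receivers:
  - name: slack-jobs
    slack_configs:
      - channel: '#jobs'                   # hypothetical channel
        send_resolved: false
        text: '{{ template "custom_slack_message" . }}'
```

With grouping like this, the template's range over .Alerts renders every alert that is currently in the group, not only the ones added since the previous notification.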
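My current result: every time a new Job completes successfully, I receive a Slack notification containing the list of all Jobs that have completed successfully.
I don't mind receiving the whole list at first, but after that I would like to receive notifications that contain only the Job(s) newly completed within the specified group interval.
Is it possible?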
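Just add an extra line to the rule so that it only shows the most recently completed job(s):
Line: for: 10m
- this is meant to list only the jobs that completed within the last 10 minutes (note that for: keeps an alert pending until its expression has been true for that long, so it delays firing rather than filtering by completion time):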
rules:
- alert: KubeJobCompleted
  annotations:
    identifier: '{{ $labels.instance }}'
    summary: Job Completed Successfully
    description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
  expr: |
    kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"} == 0
  for: 10m
  labels:
    severity: information
I ended up using kube_job_status_completion_time and time() to dismiss past events (this avoids re-firing the alert at each repeat interval).
rules:
- alert: KubeJobCompleted
  annotations:
    identifier: '{{ $labels.instance }}'
    summary: Job Completed Successfully
    description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
  expr: |
    time() - kube_job_status_completion_time < 60 and kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"} == 0
  labels:
    severity: information
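The 60-second window only catches a Job if the rule is evaluated at least once while time() - kube_job_status_completion_time is still below 60, so the evaluation interval (and the scrape interval for kube-state-metrics) probably needs to be shorter than the window. A sketch of the same rule placed in a rule group with an explicit interval; the group name and the 30s value are assumptions, not part of the original answer.

```yaml
# Prometheus rule file (sketch; group name and interval are assumed values)
groups:
  - name: kube-jobs                # hypothetical group name
    interval: 30s                  # evaluate twice per 60 s window so a completion is not skipped
    rules:
      - alert: KubeJobCompleted
        annotations:
          description: Job *{{ $labels.namespace }}/{{ $labels.job_name }}* is completed successfully.
        expr: |
          time() - kube_job_status_completion_time < 60
            and
          kube_job_spec_completions{job="kube-state-metrics"} - kube_job_status_succeeded{job="kube-state-metrics"} == 0
        labels:
          severity: information
```

Once a Job's completion time is more than 60 seconds in the past, the expression stops returning that series and the alert stops firing, so later notifications should no longer list it.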