Alertmanager does not load webhook_config
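I want to create a new receiver and route for Alertmanager that sends a heartbeat to OpsGenie.
I tried to do this by defining opsgenie_config, but I could not send a ping to the heartbeat in OpsGenie (I was able to send alerts to OpsGenie using the same API key).
Another approach I found is to use webhook_config (as suggested in #444). My manifest looks like this: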
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: opsgenie-webhook
  labels:
    managedBy: team-sre
spec:
  receivers:
    - name: heartbeat
      webhookConfigs:
        - httpConfig:
            basicAuth:
              password:
                name: opsgenie-api-key
                key: address
          url: https://api.opsgenie.com/v2/heartbeats/sre-test-cluster/ping
  route:
    groupWait: 0s
    repeatInterval: 1m
    groupInterval: 1m
    matchers:
      - name: alertname
        value: Watchdog
    receiver: heartbeat
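When I apply the manifest, the receiver and route it describes are not loaded into Alertmanager. When I check the logs, no error is recorded, but there is also no message indicating that the sidecar tried to load the new AlertmanagerConfig.
Has anyone run into the same problem and knows how to solve it?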
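I found the solution in GitHub issue #3970.
For basicAuth to be accepted, both a username and a password must be provided.
A nice trick is to set the username to ':' in base64 format (Og==).
The manifest should be defined as follows: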
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  labels:
    managedBy: team-sre
  name: alertmanager-opsgenie-config
  namespace: monitoring
spec:
  receivers:
    - name: deadmansswitch
      webhookConfigs:
        # URL of the specific heartbeat; replace <heartbeat-name> with the heartbeat name
        - url: 'https://api.opsgenie.com/v2/heartbeats/<heartbeat-name>/ping'
          sendResolved: true
          httpConfig:
            basicAuth:
              # reference to the Secret containing the login credentials
              password:
                key: apiKey
                name: opsgenie
              username:
                key: username
                name: opsgenie
  route:
    groupBy:
      - job
    groupInterval: 10s
    groupWait: 0s
    repeatInterval: 10s
    matchers:
      - name: alertname
        value: Watchdog
      - name: namespace
        value: monitoring
    receiver: deadmansswitch
---
apiVersion: v1
kind: Secret
metadata:
  namespace: monitoring
  name: opsgenie
type: Opaque
data:
  # apiKey must be encoded in base64
  apiKey: YOUR_PASSWORD
  # ':' in base64 - fix suggested in https://github.com/prometheus-operator/prometheus-operator/issues/3970#issuecomment-888893008
  username: Og==
After applying the manifests and firing an alert that matches the route, Opsgenie receives the heartbeat ping.
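Values under a Secret's data field have to be base64-encoded, which is where Og== comes from. As a rough sketch of how those two values could be produced, the Python snippet below encodes ':' and an API key. The helper name secret_data_value and the string example-api-key are made up for illustration; substitute your real key.

```python
import base64


def secret_data_value(plain: str) -> str:
    """Return the base64 text that goes under a Secret's `data` field."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


# The username from the manifest above: a single colon.
username = secret_data_value(":")

# Hypothetical placeholder; use the real OpsGenie API key here.
api_key = secret_data_value("example-api-key")

print("username:", username)
print("apiKey:", api_key)
```

The first printed value should match the username in the Secret, and the second is what would replace YOUR_PASSWORD.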