How to get a single acknowledgement / OK notification in icinga2 from escalating notification templates?
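In icinga2 monitoring, I would like to escalate problem notifications after a service has been down for a while, or de-escalate them outside business hours. I would like to get a single notification when the service recovers.
When I set "service-test-down-1" and "service-test-down-2" to all types and states, I get two "OK" messages when the service comes back up. When I set it up as below, with the OK messages separated from the not-OK messages, I never get any OK at all. I feel like this should be straightforward, but I have not made any headway.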
apply Notification "service-test-down-1" to Service {
  command = "dispatch-service"
  states = [ Warning, Critical, Unknown ]
  types = [ Problem, Custom, FlappingStart, FlappingEnd,
            DowntimeStart, DowntimeEnd, DowntimeRemoved ]
  users = ["russ"]
  period = "24x7"
  assign where "tests" in service.groups
  vars.priority = "medium"
  times.begin = 0m
  times.end = 3m
  interval = 1m
}

apply Notification "service-test-down-2" to Service {
  command = "dispatch-service"
  states = [ Warning, Critical, Unknown ]
  types = [ Problem, Custom, FlappingStart, FlappingEnd,
            DowntimeStart, DowntimeEnd, DowntimeRemoved ]
  period = "24x7"
  users = ["russ"]
  assign where "tests" in service.groups
  vars.priority = "medium"
  times.begin = 3m
  times.end = 12h
  interval = 2m
}

apply Notification "service-test-recovery" to Service {
  command = "dispatch-service"
  states = [ OK ]
  types = [ Acknowledgement, Recovery ]
  users = ["russ"]
  period = "24x7"
  vars.priority = "medium"
  assign where "tests" in service.groups
  interval = 1
}

apply Service "NotificationTest" {
  enable_active_checks = true
  check_command = "passive"
  max_check_attempts = 1
  ignore where host.vars.noservices == true
  groups += ["tests"]
  assign where host.name == "icinga2.acceleration.net"
  max_check_attempts = 5
  check_interval = 5m
  retry_interval = 5m
}
This configuration is printed by icinga as:
~# icinga2 object list --name service-test-*
Object 'icinga2.acceleration.net!NotificationTest!service-test-down-1' of type 'Notification':
  % declared in '/opt/icinga2lib/lib.conf.d//test.conf', lines 2:1-2:51
  * __name = "icinga2.acceleration.net!NotificationTest!service-test-down-1"
  * command = "dispatch-service"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 3:3-3:30
  * command_endpoint = ""
  * host_name = "icinga2.acceleration.net"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 2:1-2:51
  * interval = 60
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 13:3-13:15
  * name = "service-test-down-1"
  * package = "_etc"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 2:1-2:51
  * period = "24x7"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 8:3-8:17
  * service_name = "NotificationTest"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 2:1-2:51
  * states = [ "Warning", "Critical", "Unknown" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 4:3-4:41
  * templates = [ "service-test-down-1" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 2:1-2:51
  * times
    * begin = 0
      % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 11:3-11:18
    * end = 180
      % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 12:3-12:16
  * type = "Notification"
  * types = [ "Problem", "Custom", "FlappingStart", "FlappingEnd", "DowntimeStart", "DowntimeEnd", "DowntimeRemoved" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 5:3-6:57
  * user_groups = null
  * users = [ "russ" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 7:3-7:18
  * vars
    * priority = "medium"
      % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 10:3-10:26
  * zone = ""

Object 'icinga2.acceleration.net!NotificationTest!service-test-down-2' of type 'Notification':
  % declared in '/opt/icinga2lib/lib.conf.d//test.conf', lines 16:1-16:51
  * __name = "icinga2.acceleration.net!NotificationTest!service-test-down-2"
  * command = "dispatch-service"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 17:3-17:30
  * command_endpoint = ""
  * host_name = "icinga2.acceleration.net"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 16:1-16:51
  * interval = 120
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 27:3-27:15
  * name = "service-test-down-2"
  * package = "_etc"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 16:1-16:51
  * period = "24x7"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 21:3-21:17
  * service_name = "NotificationTest"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 16:1-16:51
  * states = [ "Warning", "Critical", "Unknown" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 18:3-18:41
  * templates = [ "service-test-down-2" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 16:1-16:51
  * times
    * begin = 180
      % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 25:3-25:18
    * end = 43200
      % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 26:3-26:17
  * type = "Notification"
  * types = [ "Problem", "Custom", "FlappingStart", "FlappingEnd", "DowntimeStart", "DowntimeEnd", "DowntimeRemoved" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 19:3-20:57
  * user_groups = null
  * users = [ "russ" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 22:3-22:18
  * vars
    * priority = "medium"
      % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 24:3-24:26
  * zone = ""

Object 'icinga2.acceleration.net!NotificationTest!service-test-recovery' of type 'Notification':
  % declared in '/opt/icinga2lib/lib.conf.d//test.conf', lines 29:1-29:53
  * __name = "icinga2.acceleration.net!NotificationTest!service-test-recovery"
  * command = "dispatch-service"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 30:3-30:30
  * command_endpoint = ""
  * host_name = "icinga2.acceleration.net"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 29:1-29:53
  * interval = 1
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 37:3-37:14
  * name = "service-test-recovery"
  * package = "_etc"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 29:1-29:53
  * period = "24x7"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 34:3-34:17
  * service_name = "NotificationTest"
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 29:1-29:53
  * states = [ "OK" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 31:3-31:17
  * templates = [ "service-test-recovery" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 29:1-29:53
  * times = null
  * type = "Notification"
  * types = [ "Acknowledgement", "Recovery" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 32:3-32:39
  * user_groups = null
  * users = [ "russ" ]
    % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 33:3-33:18
  * vars
    * priority = "medium"
      % = modified in '/opt/icinga2lib/lib.conf.d//test.conf', lines 35:3-35:26
  * zone = ""
Link to the relevant documentation: https://www.icinga.com/docs/icinga2/latest/doc/03-monitoring-basics/#notification-escalations
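The answer provided by the icinga2 developers (on GitHub: https://github.com/Icinga/icinga2/issues/5478) is that, when you use notification escalations, there is no way to send a single recovery notification from within icinga2.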
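Each escalation is a separate Notification object, and every object that sent a problem notification will also send its own recovery message. A notification object that never sent a PROBLEM notification will never send a RECOVERY notification (this seems wrong, but whatever).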
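To make that rule concrete, here is a small Python sketch that models it. This is not icinga2 code: the class and function names are made up for illustration, the window test (begin <= elapsed < end) is my assumption about how the boundaries behave, and the model only covers what is described above. It reproduces both results from the question: two OK messages with the all-types setup, and none with the split setup.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass
class NotificationObject:
    """Hypothetical stand-in for one icinga2 Notification object."""
    name: str
    types: frozenset            # notification types this object may send
    begin: float = 0            # times.begin in seconds
    end: float = INF            # times.end in seconds (unbounded if unset)
    sent_problem: bool = False  # has this object sent a PROBLEM yet?


def send_problem(objects, elapsed):
    """Send PROBLEM from every object whose window covers `elapsed` seconds."""
    fired = []
    for obj in objects:
        if "Problem" in obj.types and obj.begin <= elapsed < obj.end:
            obj.sent_problem = True
            fired.append(obj.name)
    return fired


def send_recovery(objects):
    """RECOVERY goes out only from objects that allow it and sent a PROBLEM."""
    return [o.name for o in objects if "Recovery" in o.types and o.sent_problem]


def outage(objects, problem_times):
    """Replay PROBLEM notifications at the given times, then recover."""
    for t in problem_times:
        send_problem(objects, t)
    return send_recovery(objects)


# Setup 1: both escalations allow all types (simplified to three types here).
all_types = frozenset({"Problem", "Recovery", "Acknowledgement"})
setup_all = [
    NotificationObject("service-test-down-1", all_types, 0, 180),
    NotificationObject("service-test-down-2", all_types, 180, 43200),
]

# Setup 2: the configuration from the question, with recovery split out.
setup_split = [
    NotificationObject("service-test-down-1", frozenset({"Problem"}), 0, 180),
    NotificationObject("service-test-down-2", frozenset({"Problem"}), 180, 43200),
    NotificationObject("service-test-recovery",
                       frozenset({"Acknowledgement", "Recovery"})),
]

print(outage(setup_all, [0, 60, 120, 240]))    # both escalations report OK
print(outage(setup_split, [0, 60, 120, 240]))  # no object reports OK
```

In the split setup, "service-test-recovery" is the only object allowed to send a RECOVERY, but it never sent a PROBLEM, so it stays silent.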
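The suggested solution was to use a notification proxy that removes the duplicate messages for you. Since a stateful proxy was undesirable, I wrote a function that sets current_escalation on the host/service being notified, so that only the current escalation actually sends the RECOVERY message and our proxy can remain stateless. Example code is on GitHub.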
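The real code is on GitHub and is not reproduced here. As a rough illustration of the idea only, a stateless filter could look like the Python sketch below. Everything in it is hypothetical: the function names, the way current_escalation is derived from the elapsed time, and the assumption that the proxy is handed the notification type, the notification name, and the current_escalation value with each message.

```python
# Escalation windows from the question, as (name, begin, end) in seconds.
ESCALATIONS = [
    ("service-test-down-1", 0, 180),
    ("service-test-down-2", 180, 43200),
]


def current_escalation(elapsed, escalations=ESCALATIONS):
    """Name of the escalation whose window contains `elapsed`, else None."""
    for name, begin, end in escalations:
        if begin <= elapsed < end:
            return name
    return None


def should_forward(notification_type, notification_name, current):
    """Stateless filter: drop RECOVERY unless it comes from the current escalation."""
    if notification_type != "RECOVERY":
        return True
    return notification_name == current


# The service was down for 10 minutes, so both escalations emit a RECOVERY.
current = current_escalation(600)
incoming = [
    ("RECOVERY", "service-test-down-1"),
    ("RECOVERY", "service-test-down-2"),
]
forwarded = [name for kind, name in incoming if should_forward(kind, name, current)]
print(forwarded)  # only the message from the current escalation is left
```

The proxy does not have to remember earlier messages, because each message carries everything needed for the decision.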