Azure Kubernetes Service (AKS) - Pod restart alert

I want to create an alert rule that fires when a pod restarts, i.e. if a pod restarts twice within a 30-minute window.

I have the following Log Analytics query:

KubePodInventory
| where ServiceName == "xxxx"
| project PodRestartCount, TimeGenerated, ServiceName
| summarize AggregatedValue = count(PodRestartCount) by ServiceName, bin(TimeGenerated, 30m) 

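However, setting the alert threshold to 2 will not work in this case, because PodRestartCount is never reset. Any help would be appreciated. Maybe there is a better approach that I am missing.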

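To reset the count between bin() intervals, you can use the prev() function on the serialized output to compute the difference between consecutive bins: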

KubePodInventory
| where ServiceName == "<service name>"
| where Namespace == "<namespace name>"
| summarize AggregatedPodRestarts = sum(PodRestartCount) by bin(TimeGenerated, 30m)
// prev() reads the preceding row, so order the bins by time first
| sort by TimeGenerated asc
| serialize
| extend prevPodRestarts = prev(AggregatedPodRestarts, 1)
// change in the aggregated counter relative to the previous bin
| extend diff = AggregatedPodRestarts - prevPodRestarts
| where diff >= 2

This outputs the correct difference for each bin period.

TimeGenerated [UTC]         prevPodRestarts diff        AggregatedPodRestarts
5/12/2020, 12:00:00.000 AM  1,368,477       191,364     1,559,841   
5/11/2020, 11:00:00.000 PM  1,552,614       3,594       1,556,208   
5/11/2020, 10:00:00.000 PM  182,217         1,370,397   1,552,614
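
The values above run into the millions, which looks like the effect of summing a cumulative counter over every inventory record that falls in a bin. If what you want is closer to "restarts per pod per window", a variant along the following lines may be worth trying. This is only a sketch and I have not run it against your workspace: it assumes the Name column of KubePodInventory identifies the pod, and the names MaxRestarts, PrevName, PrevRestarts and RestartsInBin are made up for illustration.

```kusto
KubePodInventory
| where ServiceName == "<service name>"
| where Namespace == "<namespace name>"
// highest cumulative restart count seen for each pod in each 30-minute bin
| summarize MaxRestarts = max(PodRestartCount) by Name, bin(TimeGenerated, 30m)
// keep each pod's bins together and in time order so prev() sees the same pod
| sort by Name asc, TimeGenerated asc
| extend PrevName = prev(Name, 1), PrevRestarts = prev(MaxRestarts, 1)
// drop the first bin of each pod, which has no earlier bin to compare with
| where Name == PrevName
| extend RestartsInBin = MaxRestarts - PrevRestarts
| where RestartsInBin >= 2
```

Taking max() instead of sum() keeps the counter on the same scale as the actual restart count, and comparing PrevName stops one pod's counter from being subtracted from another's. Note that it compares each bin with the previous bin that has data for that pod, so a gap in the logs would widen the effective window.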

References: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/serializeoperator

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/prevfunction