Azure Kubernetes Service (AKS) - Pod restart alert
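I want to create an alert rule that fires when a pod restarts, i.e. if a pod restarts twice within a 30-minute window.
I have the following Log Analytics query: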
KubePodInventory
| where ServiceName == "xxxx"
| project PodRestartCount, TimeGenerated, ServiceName
| summarize AggregatedValue = count(PodRestartCount) by ServiceName, bin(TimeGenerated, 30m)
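But setting the alert threshold to 2 will not work in this case, because PodRestartCount is not reset. Any help would be appreciated. Maybe there is a better approach that I am missing.
To reset the count between bin() windows, you can use the prev() function on serialized output to compute the difference: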
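KubePodInventory
| where ServiceName == "<service name>"
| where Namespace == "<namespace name>"
// Total restart count reported within each 30-minute bin
| summarize AggregatedPodRestarts = sum(PodRestartCount) by bin(TimeGenerated, 30m)
// summarize does not guarantee row order, so sort before using prev()
| order by TimeGenerated asc
| serialize
// Value of the previous bin (null for the first row)
| extend prevPodRestarts = prev(AggregatedPodRestarts, 1)
| extend diff = AggregatedPodRestarts - prevPodRestarts
// Keep only bins where the count grew by 2 or more
| where diff >= 2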
This outputs the difference for each bin period:
TimeGenerated [UTC]          prevPodRestarts   diff        AggregatedPodRestarts
5/12/2020, 12:00:00.000 AM   1,368,477         191,364     1,559,841
5/11/2020, 11:00:00.000 PM   1,552,614         3,594       1,556,208
5/11/2020, 10:00:00.000 PM   182,217           1,370,397   1,552,614
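To check the arithmetic offline, the bin / prev() / diff steps can be sketched in Python. This is only an illustration of the logic, not a replacement for the query: the helper names (bin_start, restarts_per_bin, alerts) and the sample data are hypothetical, and the samples are chosen so that each 30-minute bin holds exactly one record.

```python
from datetime import datetime, timedelta

def bin_start(ts, width=timedelta(minutes=30)):
    """Round a timestamp down to the start of its bin, like bin(TimeGenerated, 30m)."""
    epoch = datetime(1970, 1, 1)
    return epoch + ((ts - epoch) // width) * width

def restarts_per_bin(samples, width=timedelta(minutes=30)):
    """samples: iterable of (timestamp, PodRestartCount) pairs.

    Returns rows of (bin_start, prev_total, diff, total) in ascending time order.
    """
    totals = {}
    for ts, count in samples:
        key = bin_start(ts, width)
        totals[key] = totals.get(key, 0) + count  # mirrors sum(PodRestartCount)

    rows = []
    prev_total = None  # like prev(): no previous value for the first row
    for key in sorted(totals):  # explicit ascending order before taking differences
        total = totals[key]
        diff = None if prev_total is None else total - prev_total
        rows.append((key, prev_total, diff, total))
        prev_total = total
    return rows

def alerts(rows, threshold=2):
    """Keep only bins whose count grew by at least `threshold` (where diff >= 2)."""
    return [row for row in rows if row[2] is not None and row[2] >= threshold]

# Hypothetical samples: a cumulative restart counter, one record per bin.
samples = [
    (datetime(2020, 5, 11, 10, 5), 3),
    (datetime(2020, 5, 11, 10, 35), 3),
    (datetime(2020, 5, 11, 11, 10), 5),
    (datetime(2020, 5, 11, 11, 40), 6),
]
rows = restarts_per_bin(samples)
fired = alerts(rows)
for row in fired:
    print(row)
```

With this sample data only the 11:00 bin qualifies, since the counter goes from 3 to 5 there. Note that if a bin contains several records, summing a cumulative counter grows with the number of records as well as with the number of restarts, so the differences are only directly comparable when each bin holds the same number of records.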
References:
https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/serializeoperator
https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/prevfunction