CloudWatch alarm based on a Kinesis metric not triggered when the value is over the threshold

Question

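Problem description

We have an AWS CloudWatch alarm whose monitored metric is very clearly above the threshold line shown on its graph, yet the alarm is not triggering.

What is going on? How can the metric stay clearly above the threshold for longer than the alarm's period and evaluation window without the alarm triggering?

[Image: alarm configuration and empty history]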

Answer 1

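If we look at the alarm's settings, two things stand out.

The first is that the alarm is in the Insufficient Data state even though the graph shows a continuous line.

The second is that the alarm is configured with a unit of seconds, while the graph above shows milliseconds. In fact, if we list a set of datapoints for the iterator age: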

    aws cloudwatch get-metric-statistics \
        --namespace "AWS/Lambda" \
        --metric-name "IteratorAge" \
        --dimensions Name=FunctionName,Value=prod-pipeline-rules-exec \
        --statistics Maximum \
        --start-time $(gdate -u -d '20 minutes ago' +%Y-%m-%dT%TZ) \
        --end-time $(gdate -u +%Y-%m-%dT%TZ) \
        --period 60 \
        --region <region>
    [
        {
            "Timestamp": "2019-12-18T01:43:00Z",
            "Maximum": 2327.0,
            "Unit": "Milliseconds"
        },
        {
            "Timestamp": "2019-12-18T01:25:00Z",
            "Maximum": 2188.0,
            "Unit": "Milliseconds"
        },
        {
            "Timestamp": "2019-12-18T01:34:00Z",
            "Maximum": 2459.0,
            "Unit": "Milliseconds"
        }
    ]

the unit is milliseconds.
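To confirm the mismatch from the command line, the unit stored on the alarm can be compared with the unit of the datapoints. The following is a minimal sketch, not a definitive check: the helper names `alarm_unit` and `units_compatible` are made up for illustration, and it assumes the AWS CLI is configured with access to the alarm.

```shell
# Print the Unit configured on an alarm.
# With --output text the CLI prints "None" when the alarm has no unit set.
alarm_unit() {
    aws cloudwatch describe-alarms \
        --alarm-names "$1" \
        --query 'MetricAlarms[0].Unit' \
        --output text
}

# Succeed when the alarm has no unit, or when its unit equals the metric's unit.
# $1 = unit on the alarm, $2 = unit reported by the datapoints
units_compatible() {
    [ "$1" = "None" ] || [ "$1" = "$2" ]
}
```

A possible usage, with a hypothetical alarm name: `units_compatible "$(alarm_unit my-iterator-age-alarm)" Milliseconds || echo "unit mismatch"`.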

Unfortunately, CloudWatch treats a unit mismatch as missing data, so the alarm stays in Insufficient Data and never fires.
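One way to fix this is to re-create the alarm without a unit and express the threshold in the metric's own unit, milliseconds. The sketch below is only an illustration under stated assumptions: the function names, the 60-second period, the single evaluation period and the comparison operator are placeholders, not the original alarm's settings.

```shell
# Convert a threshold given in whole seconds to milliseconds,
# the unit in which IteratorAge is reported.
seconds_to_ms() {
    echo $(( $1 * 1000 ))
}

# Re-create the alarm with a millisecond threshold and no --unit option,
# so CloudWatch evaluates the datapoints in whatever unit they were published.
# $1 = alarm name, $2 = Lambda function name, $3 = threshold in seconds
put_iterator_age_alarm() {
    aws cloudwatch put-metric-alarm \
        --alarm-name "$1" \
        --namespace "AWS/Lambda" \
        --metric-name "IteratorAge" \
        --dimensions "Name=FunctionName,Value=$2" \
        --statistic Maximum \
        --period 60 \
        --evaluation-periods 1 \
        --threshold "$(seconds_to_ms "$3")" \
        --comparison-operator GreaterThanThreshold
}
```

For example, `put_iterator_age_alarm my-iterator-age-alarm prod-pipeline-rules-exec 2` would set a 2000 ms threshold (the alarm name here is hypothetical). Note that `put-metric-alarm` overwrites the existing alarm configuration, so any alarm actions need to be passed again.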


