CloudWatch Metrics for Volume IOPS, Volume Throughput (MiB/s) and Network (Gbps)
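I had to troubleshoot an application on AWS, and interpreting the health of the environment from all the CloudWatch metric graphs was not easy, so I decided to share my experience here.
CloudWatch gives us CPU, memory*, disk and network metrics.
* To get memory metrics you need to install the CloudWatch Agent.
CPU and memory metrics are reported as percentages, which are easy to interpret.
Disk and network are not so easy: for example, I wanted to check the IOPS and throughput (MiB/s) of my volumes and the network throughput (Gbps) of my instance.
I need these values because AWS defines EBS limits in IOPS and throughput (MB/s), and instance network limits in Gbps.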
Total IOPS
An EBS volume gives us the metrics VolumeReadOps and VolumeWriteOps. Let me quote the AWS documentation.
VolumeReadOps
- The total number of read operations in a specified period of time.
To calculate the average read operations per second (read IOPS) for the period, divide the total read operations in the period by the number of seconds in that period.
VolumeWriteOps
- The total number of write operations in a specified period of time.
To calculate the average write operations per second (write IOPS) for the period, divide the total write operations in the period by the number of seconds in that period.
Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cloudwatch_ebs.html
To get the total IOPS we need (VolumeReadOps + VolumeWriteOps) / SecondsInPeriod.
Fortunately, CloudWatch helps us here with metric math expressions. Use the expression below; the PERIOD function is our friend.
m1 = VolumeWriteOps - Sum
m2 = VolumeReadOps - Sum
Expression: (m1+m2)/PERIOD(m1)
Reference: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math.html
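If you would rather fetch the same value through the API than through the console, the expression above could be sketched as a MetricDataQueries list, which is the structure boto3's get_metric_data accepts. This is only a sketch: the helper name build_total_iops_queries, the placeholder volume ID and the 300-second period are assumptions for illustration.

```python
def build_total_iops_queries(volume_id, period_seconds=300):
    """Build metric math queries for the total IOPS of one EBS volume.

    Hypothetical helper: it only builds the request structure and makes no API call.
    """

    def ebs_sum(query_id, metric_name):
        # One raw metric with the Sum statistic, hidden from the final result.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/EBS",
                    "MetricName": metric_name,
                    "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
                },
                "Period": period_seconds,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return [
        ebs_sum("m1", "VolumeWriteOps"),
        ebs_sum("m2", "VolumeReadOps"),
        {
            # Same expression as in the console: operations divided by seconds.
            "Id": "total_iops",
            "Expression": "(m1+m2)/PERIOD(m1)",
            "Label": "Total IOPS",
            "ReturnData": True,
        },
    ]


# Placeholder volume ID, not a real resource.
queries = build_total_iops_queries("vol-0123456789abcdef0")
print(queries[-1]["Expression"])
```

The resulting list would be passed as the MetricDataQueries argument of get_metric_data, together with a StartTime and an EndTime.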
Total Throughput (MiB/s)
An EBS volume gives us the metrics VolumeReadBytes and VolumeWriteBytes. Let me quote the AWS documentation.
VolumeReadBytes
- Provides information on the read operations in a specified period of time. The Sum statistic reports the total number of bytes transferred during the period.
VolumeWriteBytes
- Provides information on the write operations in a specified period of time. The Sum statistic reports the total number of bytes transferred during the period.
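Both metrics give us values in bytes, but we want them in MiB, so to convert we need to divide by 1048576, which is the result of 1024 * 1024. Let me explain in detail.
1024 bytes = 1 KiB
1024 KiB = 1 MiB
To get the total throughput in MiB/s we need ((VolumeReadBytes + VolumeWriteBytes) / 1048576) / SecondsInPeriod.
Use the expression below; again, the PERIOD function is our friend.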
m1 = VolumeWriteBytes - Sum
m2 = VolumeReadBytes - Sum
Expression: ((m1+m2)/1048576)/PERIOD(m1)
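To make the arithmetic concrete, here is a small sketch that applies the same formula to made-up numbers. The function name and the sample byte counts are assumptions for illustration, not values from a real volume.

```python
BYTES_PER_MIB = 1024 * 1024  # 1048576


def total_throughput_mib_per_second(read_bytes_sum, write_bytes_sum, period_seconds):
    """Convert the Sum of read and write bytes in one period to MiB/s."""
    total_mib = (read_bytes_sum + write_bytes_sum) / BYTES_PER_MIB
    return total_mib / period_seconds


# Hypothetical sample: 15000 MiB read and 15000 MiB written in a 300-second period.
sample_bytes = 15000 * BYTES_PER_MIB
example = total_throughput_mib_per_second(sample_bytes, sample_bytes, 300)
print(example)  # 100.0
```

In this sample, 30000 MiB moved in 300 seconds works out to 100 MiB/s, which is the number you would compare against the volume's throughput limit.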
Total Network (Gbps)
An EC2 instance gives us the metrics NetworkIn and NetworkOut. Let me quote the AWS documentation.
NetworkIn
- The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to a single instance.
The number reported is the number of bytes received during the period. If you are using basic (five-minute) monitoring, you can divide this number by 300 to find Bytes/second. If you have detailed (one-minute) monitoring, divide it by 60.
NetworkOut
- The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic from a single instance.
The number reported is the number of bytes sent during the period. If you are using basic (five-minute) monitoring, you can divide this number by 300 to find Bytes/second. If you have detailed (one-minute) monitoring, divide it by 60.
Reference: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html
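Both metrics give us values in bytes per period, but we want them in gigabits per second.
To convert from "period" to "second", we just need to divide by 300 (because I am using basic five-minute monitoring; PERIOD(m1) would work here too).
To convert from bytes to gigabits, we need to multiply by 8 and divide by 1000000000 (1000 * 1000 * 1000), which is the same as dividing by 125000000. Let me explain in detail.
1 byte = 8 bits
1000 bits = 1 kilobit
1000 kilobits = 1 megabit
1000 megabits = 1 gigabit
To get the total network throughput in Gbps we need ((NetworkIn + NetworkOut) / 300) * 8 / 1000000000.
m1 = NetworkIn - Sum
m2 = NetworkOut - Sum
Expression: ((m1+m2)/300)*8/1000000000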
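A quick way to sanity-check the byte-to-gigabit conversion is to run the numbers by hand. The sketch below assumes decimal gigabits (1 gigabit = 1000 * 1000 * 1000 bits) and a 300-second period; the function name and the sample values are made up for illustration.

```python
BITS_PER_BYTE = 8
BITS_PER_GIGABIT = 1000 * 1000 * 1000


def total_network_gbps(network_in_sum, network_out_sum, period_seconds=300):
    """Convert the Sum of NetworkIn and NetworkOut bytes in one period to Gbps."""
    bytes_per_second = (network_in_sum + network_out_sum) / period_seconds
    return bytes_per_second * BITS_PER_BYTE / BITS_PER_GIGABIT


# Hypothetical sample: 18.75 GB in and 18.75 GB out during a 300-second period.
example = total_network_gbps(18_750_000_000, 18_750_000_000)
print(example)  # 1.0
```

Here 37.5 billion bytes in 300 seconds is 125 million bytes per second, which is exactly 1 Gbps.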