How to monitor whether an EMR cluster has terminated using CloudWatch

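I want to set up an alarm that fires whenever any EMR cluster terminates (because of an internal error). I know there is an "IsIdle" metric, but my EMR clusters are designed to be persistent, so "IsIdle" doesn't really fit my case. Is there a health-check metric I can use?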

You can configure Amazon CloudWatch to send a "State Change" event to another service, such as an AWS Lambda function or an Amazon SNS topic.

To do this, open the CloudWatch console and, in the navigation pane, click Rules > Create rule.

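  1. Service Name: EMR
  2. Event Type: State Change
  3. Specific detail type(s): EMR Cluster State Change
  4. Specific state(s): TERMINATED and TERMINATED_WITH_ERRORS
  5. Targets: Add the receiving service of your choice.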
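
The console choices above should correspond to an event pattern roughly like the one built below. This is a minimal sketch using only the Python standard library; the function name `build_emr_termination_pattern` is made up for illustration, and the field names are taken from the sample event in this answer.

```python
import json


def build_emr_termination_pattern(states=("TERMINATED", "TERMINATED_WITH_ERRORS")):
    """Return an event pattern dict matching EMR cluster state changes.

    Hypothetical helper: the keys mirror the "source", "detail-type" and
    "detail.state" fields of the sample event shown in this answer.
    """
    return {
        "source": ["aws.emr"],
        "detail-type": ["EMR Cluster State Change"],
        "detail": {"state": list(states)},
    }


# Serialize the pattern so it can be pasted into the rule's pattern editor.
pattern_json = json.dumps(build_emr_termination_pattern(), indent=2)
print(pattern_json)
```

If you only care about failures, passing `states=("TERMINATED_WITH_ERRORS",)` narrows the rule to clusters that ended with errors.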

Here is an example of such an event:

{
  "version": "0",
  "id": "8535abb0-f87e-4640-b7b6-8de000dfc30a",
  "detail-type": "EMR Cluster State Change",
  "source": "aws.emr",
  "account": "123456789012",
  "time": "2016-12-16T21:00:23Z",
  "region": "us-east-1",
  "resources": [],
  "detail": {
    "severity": "INFO",
    "stateChangeReason": "{\"code\":\"USER_REQUEST\",\"message\":\"Terminated by user request\"}",
    "name": "Development Cluster",
    "clusterId": "j-1YONHTCP3YZKC",
    "state": "TERMINATED",
    "message": "Amazon EMR Cluster j-1YONHTCP3YZKC (Development Cluster) has terminated at 2016-12-16 21:00 UTC with a reason of USER_REQUEST."
  }
}
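
If the target is a Lambda function, it could inspect the event along these lines. This is only a sketch: `parse_termination` is a hypothetical helper, and treating every reason code other than `USER_REQUEST` as unexpected is an assumption that suits persistent clusters, not an official rule. Note that `stateChangeReason` arrives as a JSON string inside the event, so it has to be decoded separately.

```python
import json


def parse_termination(event):
    """Extract cluster id, state and reason code from an EMR state-change event."""
    detail = event.get("detail", {})
    # "stateChangeReason" is itself a JSON-encoded string in the sample event.
    try:
        reason = json.loads(detail.get("stateChangeReason", "{}"))
    except (TypeError, ValueError):
        reason = {}
    if not isinstance(reason, dict):
        reason = {}
    code = reason.get("code", "UNKNOWN")
    state = detail.get("state")
    # Assumption: for a persistent cluster, anything other than a
    # user-requested termination is worth alerting on.
    unexpected = state == "TERMINATED_WITH_ERRORS" or code != "USER_REQUEST"
    return {
        "cluster_id": detail.get("clusterId"),
        "state": state,
        "code": code,
        "unexpected": unexpected,
    }


def lambda_handler(event, context):
    """Lambda entry point: log unexpected terminations and return the summary."""
    info = parse_termination(event)
    if info["unexpected"]:
        # Replace this print with your own notification logic.
        print("Unexpected EMR termination: {cluster_id} ({state}, {code})".format(**info))
    return info
```

For the sample event above, the reason code is `USER_REQUEST`, so this sketch would not flag it.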