AWS CloudWatch alarms in Datadog

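While reading the Datadog AWS integration documentation, I found a mention that AWS alarms can be streamed into Datadog. The Alarm collection section says you can choose between two different methods for sending AWS CloudWatch alarms to the Datadog Event Stream. However, there is no further explanation of how to do this or what needs to be set up. Googling something like "Datadog aws alarm polling" only turns up vague descriptions of other features, and nothing about AWS CloudWatch alarms.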

My question is: is this possible?

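What I have tried so far is setting up the Datadog Lambda forwarder, which sends CloudWatch logs (and, I assume, metrics and alarms too?) to DD. I granted permissions to that Lambda. I created some AWS metric filters and AWS alarms that fire when specific events occur. I then ran some Lambda code that throws an exception and causes the CloudWatch alarm to change state.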

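I can clearly see the Lambda logs in DD, but I can't find anything related to my alarm in DD Events. I assume this is not a problem with the DD-AWS integration itself, because we use it across a large organization and it was configured long before I joined the company. What am I doing wrong?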

The CloudFormation script is below (I removed some parts, so it will not work as-is):

Resources:
  DatadogForwarderLambda:
    Type: AWS::Lambda::Function
    Properties:
      Description: Pushes logs, metrics and traces from AWS to Datadog.
      Role: !GetAtt "DatadogForwarderLambdaRole.Arn"
      Handler: lambda_function.lambda_handler
      Code:
        S3Bucket: config-sandbox
        S3Key: 'aws-dd-forwarder-3.38.0.zip'
      MemorySize: 1024
      Runtime: python3.7
      Timeout: 120
      Tags:
        - Key: "dd_forwarder_version"
          Value: "3.38.0"
      Environment:
        Variables:
          DD_ENHANCED_METRICS: "false"
          DD_API_KEY_SECRET_ARN: 
            Ref: DdApiKeySecret
          DD_S3_BUCKET_NAME: config-sandbox
          DD_SITE: datadoghq.com
          DD_: datadoghq.com
          DD_TAGS_CACHE_TTL_SECONDS: 300
          DD_FETCH_LAMBDA_TAGS: true
          DD_USE_TCP: false
          DD_NO_SSL: false
          REDACT_IP: false
          REDACT_EMAIL: false
          DD_USE_PRIVATE_LINK: false
          DD_USE_VPC: false
      ReservedConcurrentExecutions: 100


  DatadogReadonlyPolicy:
    Type: 'AWS::IAM::Policy'
    Properties:
      PolicyName: !Sub "DatadogReadonlyPolicy"
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - 'cloudwatch:Get*'
              - 'cloudwatch:List*'
              - 'cloudwatch:DescribeAlarmHistory'
              - 'cloudtrail:LookupEvents'
              - 'ec2:Describe*'
              - 's3:GetObject'
              - 's3:PutObject'
              - 's3:DeleteObject'
              - 's3:ListBucket'
              - 'lambda:List*'
              - 'tag:GetResources'
              - 'tag:GetTagKeys'
              - 'tag:GetTagValues'
              - 'support:*'
            Resource: !GetAtt DatadogForwarderLambda.Arn
          - Effect: Allow
            Action:
              - secretsmanager:GetSecretValue
            Resource:
              - Ref: DdApiKeySecret
      Roles: 
        - !Ref DatadogForwarderLambdaRole
       

  DatadogForwarderLambdaRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
              AWS:
                - Fn::Sub:
                  - "arn:aws:iam::${AccountId}:role/human-role/some-role-name"
                  - { AccountId: !Ref 'AWS::AccountId' }
            Action:
              - sts:AssumeRole    
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        - arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
      Path: /
      PermissionsBoundary:
        Fn::Join:
              - ''
              - - 'arn:aws:iam::'
                - Ref: AWS::AccountId
                - ':policy/some-organisation-permission-boundary'
      RoleName:               
        Fn::Sub:
        - 'a${AIID}-dd-forwarder-lambda-${StackID}'
        - { StackID: !Select [4, !Split ["-", !Ref 'AWS::StackId']],
            AIID: !Ref AIID }


  IncomingQueueHasMessagesExceptionAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Incoming queue has unprocessed messages, new processing round can't be started
      AlarmName: !Sub "IncomingQueueHasMessagesExceptionAlarm"
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0 # no messages are allowed in queue if new round started
      EvaluationPeriods: 1
      Period: 10  
      Namespace: dev-logs
      MetricName: QueueHasMessagesException
      Statistic: Sum   
      TreatMissingData: missing


  IncomingQueueHasMessagesExceptionMetricFilter: 
    Type: AWS::Logs::MetricFilter
    Properties: 
      LogGroupName: 
        !Sub '/aws/lambda/${SomeLambdaName}'
      FilterPattern: "QueueHasMessagesException"
      MetricTransformations: 
        - 
          MetricNamespace: dev-logs
          MetricName: QueueHasMessagesException
          MetricValue: 1
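
For reference, one of the two alarm collection methods relies on an SNS topic: the alarm publishes its state changes to a topic, and the topic forwards them to Datadog. The fragment below is only a minimal sketch of that wiring, not a confirmed fix for the problem above. The `DdApiKey` parameter and the `AlarmNotificationTopic` resource are hypothetical names, and the HTTPS endpoint is an assumption about Datadog's SNS webhook intake for the `datadoghq.com` site that should be checked against the Datadog documentation before use.

```yaml
# Sketch only: DdApiKey and AlarmNotificationTopic are hypothetical names
# that do not appear in the script above.
Parameters:
  DdApiKey:
    Type: String
    NoEcho: true
    Description: Datadog API key used by the SNS HTTPS subscription

Resources:
  AlarmNotificationTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        # Assumed Datadog SNS webhook URL; verify it for your Datadog site.
        - Protocol: https
          Endpoint: !Sub "https://app.datadoghq.com/intake/webhook/sns?api_key=${DdApiKey}"

  IncomingQueueHasMessagesExceptionAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Incoming queue has unprocessed messages, new processing round can't be started
      AlarmName: IncomingQueueHasMessagesExceptionAlarm
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      EvaluationPeriods: 1
      # 60 s here because metric-filter metrics are standard resolution;
      # the script above uses 10.
      Period: 60
      Namespace: dev-logs
      MetricName: QueueHasMessagesException
      Statistic: Sum
      TreatMissingData: missing
      # Publish state changes to the topic so they can be forwarded on.
      AlarmActions:
        - !Ref AlarmNotificationTopic
      OKActions:
        - !Ref AlarmNotificationTopic
```

Without `AlarmActions` (as in the script above), the alarm changes state inside CloudWatch but never notifies anything, so this route only delivers events once the alarm is attached to a topic.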
  

In the end, I found out that my AWS account was not fully integrated with DD.
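
What exactly was missing from the integration is not stated here. One thing worth noting is that alarm polling is done by Datadog itself reading alarm history from the account, so the CloudWatch read permissions need to sit on a role that Datadog can assume, not on the forwarder Lambda's role as in the script above. The fragment below is a hedged sketch of such a role. `DatadogAwsAccountId`, `ExternalId`, and `DatadogIntegrationRole` are hypothetical names, and the account ID and external ID values are assumed to come from the AWS integration setup page in Datadog.

```yaml
# Sketch only: parameter and resource names are hypothetical, and the
# policy is trimmed to the CloudWatch actions listed in the script above.
Parameters:
  DatadogAwsAccountId:
    Type: String
    Description: AWS account ID that Datadog uses to assume the role
  ExternalId:
    Type: String
    Description: External ID generated by Datadog for this integration

Resources:
  DatadogIntegrationRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Sub "arn:aws:iam::${DatadogAwsAccountId}:root"
            Action: sts:AssumeRole
            # Only allow the assume-role call when the external ID matches.
            Condition:
              StringEquals:
                sts:ExternalId: !Ref ExternalId
      Policies:
        - PolicyName: DatadogAlarmPolling
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - cloudwatch:Get*
                  - cloudwatch:List*
                  - cloudwatch:DescribeAlarmHistory
                # These read actions are not scoped to a single Lambda ARN.
                Resource: "*"
```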