AWS CloudFormation is stuck at CREATE_IN_PROGRESS

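I am building a fairly large REST API with AWS Lambda, written in Node.js. There are more than 200 functions, and more are coming. Each function connects to an RDS database and either fetches or saves data.

I deploy it with the AWS SAM tool. Below is the template.yaml. Note that I have posted only a sample of the functions, because they all look the same apart from the endpoint they point to.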

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  xxx-restapi

  Sample SAM Template for xxx-restapi
  
# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 3   
    VpcConfig:
        SecurityGroupIds:
          - sg-041f2459dcd921e8e
        SubnetIds:
          - subnet-038xxx2d
          - subnet-c4dxxxcb
          - subnet-af5xxxc8

Resources:
  GetAllAccountingTypesFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: xxx-restapi/
      Handler: source/accounting-types/accountingtypes-getall.getallaccountingtypes
      Runtime: nodejs14.x
      Events:
        GetAllAccountingTypesAPIEvent:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /accountingtypes/getall
            Method: get
  GetAccountingTypeByIDFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: xxx-restapi/
      Handler: source/accounting-types/accountingtypes-byid.getbyid
      Runtime: nodejs14.x
      Events:
        GetAllAccountingTypesAPIEvent:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /accountingtypes/getbyid
            Method: get

  LambdaRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - ec2:DescribeNetworkInterfaces
                  - ec2:CreateNetworkInterface
                  - ec2:DeleteNetworkInterface
                  - ec2:DescribeInstances
                  - ec2:AttachNetworkInterface
                Resource: '*'

Outputs:
  # ServerlessRestApi is an implicit API created out of Events key under Serverless::Function
  # Find out more about other implicit resources you can reference within SAM
  # https://github.com/awslabs/serverless-application-model/blob/master/docs/internals/generated_resources.rst#api
  HelloWorldApi:
    Description: "API Gateway endpoint URL for Prod stage for functions"
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/"

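All of my functions are fine and work as expected. However, when I try to deploy, the stack gets stuck at CREATE_IN_PROGRESS. If I reduce the number of functions and try again, the deployment succeeds.

I checked the AWS CloudTrail logs and found entries like the following.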

ErrorCode: Client.RequestLimitExceeded
Resources: [{"resourceType":"AWS::EC2::SecurityGroup","resourceName":"sg-041f245xxxxd921e8e"},{"resourceType":"AWS::EC2::Subnet","resourceName":"subnet-af5xxxc8"}]

ErrorCode: Client.DryRunOperation
Resources: [{"resourceType":"AWS::EC2::SecurityGroup","resourceName":"sg-041f2459xxxx1e8e"},{"resourceType":"AWS::EC2::Subnet","resourceName":"subnet-axxxx3c8"}]

There are multiple events like the ones above. How can I fix this?

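CloudFormation is probably creating too many functions at once, so you are hitting a limit. The Client.RequestLimitExceeded errors on the security group and subnets point to EC2 API request throttling. You could arguably classify this as a bug in CloudFormation, so I think you should definitely report it to AWS or the CloudFormation team.

That said, a possible workaround is to deploy in stages: add a few Lambda functions with each update. That will be very tedious, but I don't see another way.

You can always make this easier by using nested stacks (which you can then uncomment one by one). You might even get around the throttling limit entirely that way, depending on how CloudFormation handles it, but I am not sure.

If you manage this many resources in a single stack, you also risk hitting other CloudFormation limits (especially since SAM abstracts multiple resources into a single type). Using nested stacks therefore also keeps you from hitting those limits in the (near) future.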
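As a rough sketch of the nested-stack idea, a parent template could pull in one child SAM template per group of functions. The file names, the `InvoicesStack` group, and the grouping itself are hypothetical and not taken from the question. The `DependsOn` line rests on the assumption that creating the groups one after another spreads out the EC2 calls made for VPC-attached functions. I have not verified that it avoids the throttling.

```yaml
# parent-template.yaml (hypothetical file name)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Parent stack that groups the functions into nested stacks

Resources:
  AccountingTypesStack:
    Type: AWS::Serverless::Application
    Properties:
      # Hypothetical path to a child SAM template holding one group of functions
      Location: stacks/accounting-types.yaml

  InvoicesStack:
    Type: AWS::Serverless::Application
    # Assumption: waiting for the previous group before creating this one
    # reduces the number of concurrent EC2 API calls
    DependsOn: AccountingTypesStack
    Properties:
      # Hypothetical second group; comment out or add groups as needed
      Location: stacks/invoices.yaml
```

Each child template would carry its own Globals section and function definitions. One thing to watch: functions with Api events in separate child templates each get their own implicit API unless you define a shared API explicitly, so the endpoint URLs would differ per group.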
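To illustrate how SAM multiplies resources: when a function has no Role property, SAM generates an IAM role for it, so each function in the question's template counts as several CloudFormation resources. One way to trim that, sketched here under the assumption that all functions can share the LambdaRole already defined in the template, is to reference that role explicitly:

```yaml
Resources:
  GetAllAccountingTypesFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: xxx-restapi/
      Handler: source/accounting-types/accountingtypes-getall.getallaccountingtypes
      Runtime: nodejs14.x
      # Assumption: the shared LambdaRole grants everything this function needs.
      # With Role set, SAM does not generate a separate role for the function.
      Role: !GetAtt LambdaRole.Arn
      Events:
        GetAllAccountingTypesAPIEvent:
          Type: Api
          Properties:
            Path: /accountingtypes/getall
            Method: get
```

This does not address the throttling itself. It only lowers the per-function resource count, which matters once the stack approaches the per-stack resource quota.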