Windows Service Application Multi-AZ High Availability in AWS

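I have a C#/.NET Windows service application running on a Windows EC2 instance in AWS. What is the best pattern for making it highly available across multiple Availability Zones (for example, if an AZ goes down, how would a second Windows service get started)?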

You can use a locking mechanism to prevent two instances from doing work at the same time. Redis may be a good choice, using https://github.com/samcook/RedLock.net
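
To make the locking idea concrete, here is a rough sketch of a lock held in a single Redis node. It is not the full Redlock algorithm that RedLock.net implements (that one takes the lock on a quorum of independent Redis nodes), and it is in Python rather than C# purely for illustration. The function names `try_acquire` and `release` and the key name are my own assumptions; `client` is assumed to behave like redis-py's `redis.Redis`.

```python
import uuid

# Lua script: delete the key only if it still holds our token, so one
# instance can never release a lock that another instance now owns.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


def try_acquire(client, key, ttl_ms):
    """Return a random lock token if the lock was acquired, else None."""
    token = uuid.uuid4().hex
    # SET key token NX PX ttl_ms: succeeds only if the key does not exist,
    # and the key expires on its own if the holder crashes.
    if client.set(key, token, nx=True, px=ttl_ms):
        return token
    return None


def release(client, key, token):
    """Release the lock. Returns True only if we still owned it."""
    return client.eval(RELEASE_SCRIPT, 1, key, token) == 1
```

Each service instance would call `try_acquire(client, "your-service-lock", 30000)` before doing work and stay idle when it gets `None`. The holder has to renew or re-acquire the lock before the TTL runs out, otherwise the standby instance takes over.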

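An Availability Zone in AWS is a group of data centers, and it is uncommon for all the data centers in one AZ to go down at the same time. If you are OK with your EC2 Windows instance staying in a single AZ, you can set up Auto Recovery on the instance. This recovers your EC2 instance on a healthy host if the existing instance or its underlying server becomes impaired.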
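
As an illustration of what setting up Auto Recovery can look like, the sketch below builds the parameters for a CloudWatch alarm that watches the system status check and triggers the EC2 recover action. The function name, the alarm name, and the period and evaluation values are assumptions I picked for the example, so tune them to your own tolerance for false alarms.

```python
def recovery_alarm_params(instance_id, region):
    """Build keyword arguments for the CloudWatch PutMetricAlarm call."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        # System status check: fails when the underlying host is impaired.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Minimum",
        "Period": 60,            # seconds per evaluation window (assumed)
        "EvaluationPeriods": 2,  # consecutive failing windows (assumed)
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # Built-in action that moves the instance to healthy hardware.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }
```

With boto3 installed and credentials configured, the result can be passed straight through: `boto3.client("cloudwatch", region_name=region).put_metric_alarm(**recovery_alarm_params(instance_id, region))`.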

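If you really do want to span AZs to cover the unlikely case of an AZ-level outage, you can set up EC2 Auto Scaling and select multiple AZs. Keep both the minimum and maximum instance count at 1. That way Auto Scaling manages your EC2 instance across AZs while being told never to create more than one instance at a time. If your instance becomes unavailable, or if the entire AZ it runs in becomes unavailable, the Auto Scaling service launches your single instance in one of the other AZs. That said, this is somewhat overkill for a Windows service; the pattern makes more sense for multiple web servers behind an Application Load Balancer. As I said, an AWS AZ is a group of multiple data centers, so Auto Recovery is more than enough for a single EC2 instance.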
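
For anyone who prefers to see the single-instance group as an API call, here is a minimal sketch of the parameters involved. The function name and the "at least two subnets" check are my own assumptions; the code cannot verify that the subnets really sit in different AZs, so that part is on you.

```python
def single_instance_asg_params(name, launch_configuration, subnet_ids):
    """Build keyword arguments for the Auto Scaling CreateAutoScalingGroup call."""
    if len(subnet_ids) < 2:
        # One subnet means one AZ, which defeats the purpose of the pattern.
        raise ValueError("pass subnets from at least two Availability Zones")
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_configuration,
        # Min = Max = Desired = 1: always exactly one instance, never two.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        # The API takes the subnets as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }
```

With boto3 this would be used as `boto3.client("autoscaling").create_auto_scaling_group(**params)`. The group replaces a failed instance in whichever listed subnet is still healthy.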

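If the Windows service's run time is < 15 minutes, convert it to a Lambda. That way it scales in milliseconds, you pay only for the execution time you use (rather than 24x7), and there is no patching. If not, you could consider the Batch service or, as a last resort, EC2 instances.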
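
To give a feel for the Lambda shape, here is a rough Python sketch of a handler that works through a batch and stops before it hits the timeout. The event layout (`{"items": [...]}`), the `process` function, and the 10-second safety margin are all hypothetical; a C# handler would follow the same pattern using the remaining-time value on its context object.

```python
SAFETY_MARGIN_MS = 10_000  # assumed headroom to return cleanly before timeout


def process(item):
    """Placeholder for the work the Windows service does per item."""
    return item


def handler(event, context):
    """Process work items until done or until time is nearly up."""
    items = event.get("items", [])
    done = []
    for item in items:
        # The Lambda context reports how long is left before the timeout.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        done.append(process(item))
    return {"processed": len(done), "remaining": len(items) - len(done)}
```

Anything left in `remaining` would need to be picked up by the next invocation, for example from a queue or a schedule.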

You will want to automate the deployment of everything in AWS using the AWS SDK, AWS CLI, AWS CDK, or CloudFormation.

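CloudFormation is the easiest to set up. This example creates one EC2 instance in an ASG whose min and max are both 1. That means if the EC2 instance fails, the ASG detects it (it calls this a scaling event), and because the minimum is 1, it launches another instance.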

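I've included a UserData example that downloads a PowerShell script which can install your service. That takes a long time, so I recommend baking an AMI and using it in the CloudFormation (CFN) template instead. I suggest you upload this CFN template and try terminating the instance; after a minute the ASG will launch a new one. The only way to get rid of that instance is to delete the ASG!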

Replace "YourWindowsService" with your service name and specify the VPC:

AWSTemplateFormatVersion: 2010-09-09
Description: YourWindowsService deployment script

Parameters:
  Environment:
    Description: The environment we deploy to
    Type: String
    Default: 'nonprod'
    AllowedValues: ['nonprod','prod']
  KeyPairName:
    Description: >-
      Mandatory. Enter a Public/private key pair. If you do not have one in this region,
      please create it before continuing
    Type: 'AWS::EC2::KeyPair::KeyName'
    Default: YourKeyPairName
  Subnet1ID:
    Description: 'ID of the subnet 1 for auto scaling group'
    Type: 'AWS::EC2::Subnet::Id'
  Subnet2ID:
    Description: 'ID of the subnet 2 for auto scaling group'
    Type: 'AWS::EC2::Subnet::Id'
  RemoteAccessCIDR:
    AllowedPattern: >-
      ^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/([0-9]|[1-2][0-9]|3[0-2]))$
    ConstraintDescription: CIDR block parameter must be in the form x.x.x.x/x
    Description: Allowed CIDR block for external SSH/RDP/HTTP access to YourWindowsService
    Type: String
    Default: 0.0.0.0/0
  AppName:
    Type: String
    Description: Application Name
    Default: "YourWindowsService"
  InstanceProfile:
    Type: String
    Default: YourIAMProvisioningProfileOrEc2FullAccess
  AMI:
    Type: 'AWS::EC2::Image::Id'


Mappings:
  EnvironmentToInstanceType:    
    nonprod:
      instanceType: t3.medium
    prod:
      instanceType: t3.large

  EnvironmentToVPC:
    nonprod:
      VPC: vpc-................
    prod:
      VPC: something.cant.be.null

  DisableTerminate:
    prod:
      YesorNo: 'false'
    nonprod:
      YesorNo: 'false'

Conditions:
  CreateNonProdResources: !Equals [ !Ref Environment, nonprod ]
  CreateProdResources: !Equals [ !Ref Environment, prod ]

Resources:
  YourWindowsServiceMainLogGroup:
    Type: 'AWS::Logs::LogGroup'
  SSHMetricFilter:
    Type: 'AWS::Logs::MetricFilter'
    Properties:
      LogGroupName: !Ref YourWindowsServiceMainLogGroup
      FilterPattern: ON FROM USER PWD
      MetricTransformations:
        - MetricName: SSHCommandCount
          MetricValue: 1
          MetricNamespace: !Join
            - /
            - - AWSQuickStart
              - !Ref 'AWS::StackName'
  YourWindowsServiceAutoScalingGroup:
    Type: 'AWS::AutoScaling::AutoScalingGroup'
    Properties:
      LaunchConfigurationName: !Ref YourWindowsServiceLaunchConfiguration
      AutoScalingGroupName:  !Join
            - '.'
            - - !Ref 'AWS::StackName'
              - 'ASG'
      VPCZoneIdentifier:
        - !Ref Subnet1ID
        - !Ref Subnet2ID
      MinSize: 1
      MaxSize: 1
      Cooldown: '300'
      DesiredCapacity: 1
      Tags:
        - Key: Name
          Value: YourWindowsPCName
          PropagateAtLaunch: 'true'
          
  YourWindowsServiceLaunchConfiguration:
    Type: 'AWS::AutoScaling::LaunchConfiguration'
    Properties:
      AssociatePublicIpAddress: 'false'
      PlacementTenancy: default
      KeyName: !Ref KeyPairName
      ImageId: !Ref AMI

      SecurityGroups:
        - Fn::If: [CreateNonProdResources, !Ref YourWindowsServiceNonProdSecurityGroup, !Ref "AWS::NoValue"]
        - Fn::If: [CreateProdResources, !Ref YourWindowsServiceProdSecurityGroup, !Ref "AWS::NoValue"]

      IamInstanceProfile: !Ref InstanceProfile
      InstanceType: !FindInMap [EnvironmentToInstanceType, !Ref 'Environment', instanceType]
      UserData:
        !Base64 |
          <powershell>
          #setup an install folder
          $path = "C:\temp\"
          If(!(test-path $path))
          {
            New-Item -ItemType Directory -Force -Path $path
          }
          cd $path

          #using S3 (ie the AWS API) fetch the powershell install script and execute it
          $S3BucketName = "a-unique-bucket"
          $bootstrap = "install-env.ps1"
          $script = ($path + $bootstrap)
          Set-DefaultAWSRegion -Region ap-southeast-2
          Copy-S3Object -BucketName $S3BucketName -key $bootstrap -LocalFile ($path + $bootstrap)
          & $script
          </powershell>
          <persist>true</persist>
          <runAsLocalSystem>true</runAsLocalSystem>

  YourWindowsServiceNonProdSecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Condition: CreateNonProdResources
    Properties:
      GroupDescription: Enables access to YourWindowsService
      GroupName: !Join
            - '.'
            - - !Ref 'AWS::StackName'
              - 'SG'
      VpcId: !FindInMap [EnvironmentToVPC, !Ref 'Environment', VPC]
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '22'
          ToPort: '22'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: icmp
          FromPort: '-1'
          ToPort: '-1'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: tcp
          FromPort: '80'
          ToPort: '80'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: tcp
          FromPort: '3389'
          ToPort: '3389'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: udp
          FromPort: '3389'
          ToPort: '3389'
          CidrIp: !Ref RemoteAccessCIDR

  YourWindowsServiceProdSecurityGroup:
    Type: 'AWS::EC2::SecurityGroup'
    Condition: CreateProdResources
    Properties:
      GroupDescription: Enables access to YourWindowsService
      GroupName: !Join
            - '.'
            - - !Ref 'AWS::StackName'
              - 'SG'
      VpcId: !FindInMap [EnvironmentToVPC, !Ref 'Environment', VPC]
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: '22'
          ToPort: '22'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: icmp
          FromPort: '-1'
          ToPort: '-1'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: tcp
          FromPort: '80'
          ToPort: '80'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: tcp
          FromPort: '3389'
          ToPort: '3389'
          CidrIp: !Ref RemoteAccessCIDR
        - IpProtocol: udp
          FromPort: '3389'
          ToPort: '3389'
          CidrIp: !Ref RemoteAccessCIDR