ELB health checks failing with a running AWS ECS container
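I'm currently trying to deploy an application to AWS ECS via a CloudFormation template. The Docker image is stored in AWS ECR and deployed into an ECS service fronted by an Application Load Balancer.
My service starts and my load balancer is created, but the tasks inside the ECS service repeatedly fail with the error:
Task failed ELB health checks in (target-group arn:aws:elasticloadbalancing:us-east-1:...
I've checked my security groups: the ECS service security group allows ingress from the load balancer security group, and the load balancer is created successfully.
I've manually pulled my image from ECR and run it, with no issues. What am I missing? My template is below.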
Resources:
  ECSRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [ecs.amazonaws.com]
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: ecs-service
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  # Rules which allow ECS to attach network interfaces to instances
                  # on your behalf in order for awsvpc networking mode to work right
                  - 'ec2:AttachNetworkInterface'
                  - 'ec2:CreateNetworkInterface'
                  - 'ec2:CreateNetworkInterfacePermission'
                  - 'ec2:DeleteNetworkInterface'
                  - 'ec2:DeleteNetworkInterfacePermission'
                  - 'ec2:Describe*'
                  - 'ec2:DetachNetworkInterface'
                  # Rules which allow ECS to update load balancers on your behalf
                  # with the information about how to send traffic to your containers
                  - 'elasticloadbalancing:DeregisterInstancesFromLoadBalancer'
                  - 'elasticloadbalancing:DeregisterTargets'
                  - 'elasticloadbalancing:Describe*'
                  - 'elasticloadbalancing:RegisterInstancesWithLoadBalancer'
                  - 'elasticloadbalancing:RegisterTargets'
                Resource: '*'
  # This is a role which is used by the ECS tasks themselves.
  ECSTaskExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [ecs-tasks.amazonaws.com]
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: AmazonECSTaskExecutionRolePolicy
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  # Allow the ECS Tasks to download images from ECR
                  - 'ecr:GetAuthorizationToken'
                  - 'ecr:BatchCheckLayerAvailability'
                  - 'ecr:GetDownloadUrlForLayer'
                  - 'ecr:BatchGetImage'
                  # Allow the ECS tasks to upload logs to CloudWatch
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
                Resource: '*'
  TaskDef:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Cpu: 4096
      Memory: 30720
      ContainerDefinitions:
        - Image: !Ref ECRImageUrl
          Name: !Sub "${ProjectName}-ecsContainer"
          PortMappings:
            - ContainerPort: 4000
              HostPort: 4000
              Protocol: tcp
      Family: !Sub "${ProjectName}-taskDef"
      ExecutionRoleArn: !Ref ECSTaskExecutionRole
      RequiresCompatibilities:
        - FARGATE
      NetworkMode: awsvpc
  Cluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: !Sub "${ProjectName}-ECSCluster"
  Service:
    Type: AWS::ECS::Service
    DependsOn:
      - LoadBalancerListener
    Properties:
      Cluster: !Ref Cluster
      DesiredCount: 2
      LaunchType: FARGATE
      ServiceName: !Sub "${ProjectName}-ECSService"
      TaskDefinition: !Ref TaskDef
      NetworkConfiguration:
        AwsvpcConfiguration:
          SecurityGroups:
            - !Ref FargateContainerSecurityGroup
          AssignPublicIp: ENABLED
          Subnets: !Split [',', {'Fn::ImportValue': !Sub '${VPCStackName}-PublicSubnets'}]
      LoadBalancers:
        - ContainerName: !Sub "${ProjectName}-ecsContainer"
          ContainerPort: 4000
          TargetGroupArn: !Ref TargetGroup
  FargateContainerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to the Fargate containers
      VpcId:
        Fn::ImportValue:
          !Sub '${VPCStackName}-VPC'
  EcsSecurityGroupIngressFromPublicALB:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      Description: Ingress from the public ALB
      GroupId: !Ref 'FargateContainerSecurityGroup'
      IpProtocol: -1
      SourceSecurityGroupId: !Ref 'PublicLoadBalancerSG'
  EcsSecurityGroupIngressFromSelf:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      Description: Ingress from other containers in the same security group
      GroupId: !Ref 'FargateContainerSecurityGroup'
      IpProtocol: -1
      SourceSecurityGroupId: !Ref 'FargateContainerSecurityGroup'
  PublicLoadBalancerSG:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to the public facing load balancer
      VpcId:
        Fn::ImportValue:
          !Sub '${VPCStackName}-VPC'
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          IpProtocol: -1
  ACMCertificate:
    Type: AWS::CertificateManager::Certificate
    Properties:
      DomainName: !Sub ${ProjectName}.${DomainName}
      ValidationMethod: DNS
  TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    DependsOn:
      - LoadBalancer
    Properties:
      TargetType: ip
      Name: !Sub "${ProjectName}-ECSService"
      Port: 4000
      Protocol: HTTP
      VpcId:
        Fn::ImportValue:
          !Sub '${VPCStackName}-VPC'
  LoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Scheme: internet-facing
      Subnets: !Split [',', {'Fn::ImportValue': !Sub '${VPCStackName}-PublicSubnets'}]
      SecurityGroups:
        - !Ref PublicLoadBalancerSG
  LoadBalancerListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    DependsOn:
      - LoadBalancer
    Properties:
      DefaultActions:
        - TargetGroupArn: !Ref TargetGroup
          Type: 'forward'
      LoadBalancerArn: !Ref LoadBalancer
      Port: 443
      Protocol: HTTP
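By default the health check calls / and expects a 200 status code in response. You can see its settings under EC2 -> Target Groups -> your ECS target group. Make sure the health check port is 4000, and adjust the default path and expected response status code to match what your application actually returns.
Also, you can always try connecting to your EC2 instance by its public IP or DNS name on port 4000, the port you are using, and see whether that works.
If the EC2 instance does not respond on port 4000, troubleshoot the Docker deployment: there is a problem with the task definition or its parameters.
If it does respond, the problem is in the target group's targets or the health check configuration.
Hope this helps.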
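As a rough sketch of that adjustment, the health check can be set explicitly on the target group from the question's template. The `/health` path and the interval and threshold values are placeholders, not taken from the question; substitute whichever path your application really answers with a 200.

```yaml
TargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    TargetType: ip
    Port: 4000
    Protocol: HTTP
    VpcId:
      Fn::ImportValue:
        !Sub '${VPCStackName}-VPC'
    # Health check settings (values below are illustrative assumptions)
    HealthCheckProtocol: HTTP
    HealthCheckPort: traffic-port   # probe the same port the target receives traffic on (4000)
    HealthCheckPath: /health        # hypothetical path; use one your app serves
    HealthCheckIntervalSeconds: 30
    HealthCheckTimeoutSeconds: 5
    HealthyThresholdCount: 2
    UnhealthyThresholdCount: 3
    Matcher:
      HttpCode: '200'               # widen (e.g. '200-399') if the path redirects
```

If the application needs a while to start, the service's `HealthCheckGracePeriodSeconds` property may also be worth setting so that tasks are not killed before they can respond.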
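It turns out my security groups were not permissive enough. Traffic from a Network Load Balancer is seen as coming from its original source, so if your NLB is open to all traffic, your Fargate containers should be too. This resolved my issue: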
FargateContainerSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Access to the Fargate containers
    VpcId:
      Fn::ImportValue:
        !Sub '${VPCStackName}-VPC'
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: !Ref ApplicationPort
        ToPort: !Ref ApplicationPort
        CidrIp: 0.0.0.0/0
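The snippet above refers to `ApplicationPort`, which is not declared in the template from the question. Presumably it is a template parameter; a minimal sketch of such a declaration, assuming it should default to the container port 4000 used elsewhere, could look like this:

```yaml
Parameters:
  ApplicationPort:
    Type: Number
    Default: 4000   # assumption: matches ContainerPort in the task definition
    Description: Port the container listens on
```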
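After a lot of pain, I found that the ALB itself needs to be associated with a security group (SG) that allows traffic on the ports ECS allocates dynamically. You should have an automatically defined SG that covers these port ranges. Associate this SG with your ALB and your health checks will start passing (assuming everything else is wired up correctly).
Also, make sure your task definition sets the network mode to "bridge" and the "hostPort" value to 0. This instructs ECS to allocate a port dynamically on the underlying EC2 instance and map it to your container port.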
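A sketch of what this could look like in CloudFormation follows. Note that it applies to the EC2 launch type only: Fargate tasks, as in the question, must use `awsvpc` networking, where the host port has to match the container port. `ContainerInstanceSecurityGroup` is a hypothetical name for the SG attached to the EC2 container instances, and the 32768-65535 range is an assumed ephemeral port range; the real range depends on how the instance is configured.

```yaml
TaskDef:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: !Sub "${ProjectName}-taskDef"
    NetworkMode: bridge
    RequiresCompatibilities:
      - EC2
    ContainerDefinitions:
      - Image: !Ref ECRImageUrl
        Name: !Sub "${ProjectName}-ecsContainer"
        Memory: 512               # illustrative value
        PortMappings:
          - ContainerPort: 4000
            HostPort: 0           # 0 = let ECS pick a free host port
            Protocol: tcp

# Allow the ALB to reach whichever host port ECS picked.
DynamicPortIngressFromALB:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    Description: Dynamic host ports from the public ALB
    GroupId: !Ref ContainerInstanceSecurityGroup   # hypothetical resource
    IpProtocol: tcp
    FromPort: 32768               # assumed ephemeral range
    ToPort: 65535
    SourceSecurityGroupId: !Ref PublicLoadBalancerSG
```

With dynamic host ports the target group would normally use `TargetType: instance` rather than `ip`, since the ALB then routes to the instance and the port ECS registered for each task.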