AWS ECS Fargate Tasks caught in restart-loop - How to configure LoadBalancer TargetGroup health checks with Pulumi?

I created a simple example Pulumi TypeScript program that should deploy a Spring Boot application to an AWS ECS Fargate cluster. The Spring Boot app is containerized/Dockerized with the help of Cloud Native Buildpacks/Paketo.io and published to the GitHub Container Registry at ghcr.io/jonashackt/microservice-api-spring-boot (example project here).

I read through some Pulumi tutorials and started with the usual pulumi new aws-typescript. I now have the following index.ts:

import * as awsx from "@pulumi/awsx";

// Create a load balancer to listen for requests and route them to the container.
let loadbalancer = new awsx.lb.ApplicationListener("alb", { port: 8098, protocol: "HTTP" });

// Define Container image published to the GitHub Container Registry
let service = new awsx.ecs.FargateService("microservice-api-spring-boot", {
    taskDefinitionArgs: {
        containers: {
            microservice_api_spring_boot: {
                image: "ghcr.io/jonashackt/microservice-api-spring-boot:latest",
                memory: 768,
                portMappings: [ loadbalancer ],
            },
        },
    },
    desiredCount: 2,
});

// Export the URL so we can easily access it.
export const apiUrl = loadbalancer.endpoint.hostname;

After selecting a dev stack, a normal pulumi up runs and gives me the ApplicationLoadBalancer URL. I also prepared an asciicast showing that everything runs through without errors.

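My problem now is that the Fargate services keep stopping and starting. I had a look at the CloudWatch logs, where I can see the Spring Boot app starting and then being stopped again after a few seconds. I also checked the TargetGroup of the ApplicationLoadBalancer and see that the registered targets become unhealthy again and again. How can I fix this?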

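The default AWS TargetGroup HealthCheckPath is simply / (see the docs). A standard Spring Boot app will usually respond to a request for / with an HTTP 404, whereas the health check by default only accepts an HTTP 200.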

As a result, the health check status of the targets inside the ApplicationLoadBalancer's TargetGroup goes to unhealthy, which triggers a restart of the Fargate services.
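
The mechanism can be sketched as a small model in plain TypeScript. This is a simplified illustration, not AWS's implementation: the function names matchesHttpCode and becomesUnhealthy are hypothetical, and the threshold of 2 consecutive failures is an assumption chosen for the example.

```typescript
// Returns true if `status` satisfies an ALB-style matcher such as "200", "200,202" or "200-299".
function matchesHttpCode(matcher: string, status: number): boolean {
    return matcher.split(",").some((part) => {
        const bounds = part.trim().split("-").map(Number);
        const lo = bounds[0] ?? NaN;
        const hi = bounds[1] ?? lo; // a single code is treated as the range lo-lo
        return status >= lo && status <= hi;
    });
}

// Simplified model (assumption): a target becomes unhealthy once
// `unhealthyThreshold` health checks in a row have failed.
function becomesUnhealthy(statuses: number[], matcher: string, unhealthyThreshold: number): boolean {
    let consecutiveFailures = 0;
    for (const status of statuses) {
        consecutiveFailures = matchesHttpCode(matcher, status) ? 0 : consecutiveFailures + 1;
        if (consecutiveFailures >= unhealthyThreshold) {
            return true;
        }
    }
    return false;
}

// A Spring Boot app answering 404 on "/" fails every check:
console.log(becomesUnhealthy([404, 404], "200", 2)); // true
// The same app answering 200 on a real health endpoint stays healthy:
console.log(becomesUnhealthy([200, 200], "200", 2)); // false
```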

How do we fix this? With Spring Boot you would usually use spring-boot-actuator. Once you add it to your pom.xml, the app responds at localhost:yourPort/actuator/health:

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

So we need to configure the TargetGroup that Pulumi creates to use the health check path /actuator/health instead of /.
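
Before changing the infrastructure, it can help to be clear about what a healthy response from that endpoint looks like. By default the Actuator health endpoint answers with HTTP 200 and a JSON body such as {"status":"UP"}, and with HTTP 503 when the app reports itself as down. The following is only a sketch: isActuatorHealthy is a hypothetical helper, and the load balancer itself looks only at the status code, not at the body.

```typescript
// Minimal shape of a Spring Boot Actuator health response body, e.g. {"status":"UP"}.
interface ActuatorHealth {
    status: string;
}

// Hypothetical helper: decide whether a response from /actuator/health
// indicates a healthy app (HTTP 200 and a body reporting status "UP").
function isActuatorHealthy(httpStatus: number, body: string): boolean {
    if (httpStatus !== 200) {
        return false;
    }
    try {
        const parsed = JSON.parse(body) as ActuatorHealth;
        return parsed.status === "UP";
    } catch (_err) {
        return false; // body was not valid JSON or had an unexpected shape
    }
}

console.log(isActuatorHealthy(200, '{"status":"UP"}'));   // true
console.log(isActuatorHealthy(503, '{"status":"DOWN"}')); // false
```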

The Pulumi docs tell us how to Manually Configure Target Groups, but how exactly can these be integrated into the TypeScript code? The answer is hidden inside the @pulumi/awsx/lb docs. The example code from the Pulumi tutorials does several things in the single line let loadbalancer = new awsx.lb.ApplicationListener("alb", { port: 8098, protocol: "HTTP" });:

  1. It creates an ApplicationLoadBalancer
  2. It creates a matching TargetGroup
  3. It creates a matching ApplicationListener

We simply need to create each of these components manually, because then we can configure the TargetGroup's healthCheck: path property:

import * as awsx from "@pulumi/awsx";

// Spring Boot Apps port
const port = 8098;

// Create an ApplicationLoadBalancer to listen for requests and route them to the container.
const alb = new awsx.lb.ApplicationLoadBalancer("fargateAlb");

// Create TargetGroup & Listener manually (see https://www.pulumi.com/docs/reference/pkg/nodejs/pulumi/awsx/lb/)
// so that we can configure the TargetGroup HealthCheck as described in (https://www.pulumi.com/docs/guides/crosswalk/aws/elb/#manually-configuring-target-groups)
// otherwise our Spring Boot Containers will be restarted every time, since the TargetGroup HealthChecks Status always
// goes to unhealthy
const albTargetGroup = alb.createTargetGroup("fargateAlbTargetGroup", {
    port: port,
    protocol: "HTTP",
    healthCheck: {
        // Use the default spring-boot-actuator health endpoint
        path: "/actuator/health"
    }
});

const albListener = albTargetGroup.createListener("fargateAlbListener", { port: port, protocol: "HTTP" });

// Define Container image published to the GitHub Container Registry
const service = new awsx.ecs.FargateService("microservice-api-spring-boot", {
    taskDefinitionArgs: {
        containers: {
            microservice_api_spring_boot: {
                image: "ghcr.io/jonashackt/microservice-api-spring-boot:latest",
                memory: 768,
                portMappings: [ albListener ]
            },
        },
    },
    desiredCount: 2,
});

// Export the URL so we can easily access it.
export const apiUrl = albListener.endpoint.hostname;

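Now with this configuration our Fargate services should become healthy after their startup. We should be able to verify this inside the TargetGroup of our ALB in the AWS console.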
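
If the app needs longer to start than the health check allows, the other health check settings can be adjusted in the same place. The fragment below is a sketch only: it assumes the same @pulumi/awsx version and API used above, and all numeric values are illustrative assumptions rather than recommendations.

```typescript
import * as awsx from "@pulumi/awsx";

const port = 8098;
const alb = new awsx.lb.ApplicationLoadBalancer("fargateAlb");

const albTargetGroup = alb.createTargetGroup("fargateAlbTargetGroup", {
    port: port,
    protocol: "HTTP",
    healthCheck: {
        path: "/actuator/health",
        matcher: "200",         // HTTP status codes that count as a successful check
        interval: 30,           // seconds between two checks (illustrative value)
        timeout: 5,             // seconds to wait for a response (illustrative value)
        healthyThreshold: 2,    // successful checks in a row before a target is healthy
        unhealthyThreshold: 5,  // failed checks in a row before a target is unhealthy
    },
});
```

A higher unhealthyThreshold gives a slow-starting Spring Boot container more time before the target is marked unhealthy, at the cost of detecting a real outage later.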