"Network bindings - not configured" when running service with AWS Fargate

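I'm trying to set up several services on ECS Fargate, provisioned through Terraform. They all use the same module and differ only in image, ALB target group, environment variables, and port mappings.

2 out of 3 services start their tasks successfully. Only one (unfortunately the main service) refuses to start and shows "Network bindings - not configured" for the container. The port I'm using is 80.

The task definition has the correct port mappings.

I've tried changing the port (to 8080), using multiple port mappings, and recreating the service several times, all to no effect.

Naturally, the tasks get killed by the load balancer because they fail the health checks.

Any pointers as to what might be wrong? I found some GitHub issues from 2017 related to this, but they concerned EC2-backed ECS instances and were reportedly fixed.

For reference, here is the task definition JSON: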

```
{
  "ipcMode": null,
  "executionRoleArn": "ROLE_ARN",
  "containerDefinitions": [
    {
      "dnsSearchDomains": null,
      "logConfiguration": {
        "logDriver": "awslogs",
        "secretOptions": null,
        "options": {
          "awslogs-group": "/drone",
          "awslogs-region": "eu-central-1",
          "awslogs-stream-prefix": "drone-server/"
        }
      },
      "entryPoint": null,
      "portMappings": [
        {
          "hostPort": 80,
          "protocol": "tcp",
          "containerPort": 80
        }
      ],
      "command": null,
      "linuxParameters": null,
      "cpu": 256,
      "environment": [...],
      "resourceRequirements": null,
      "ulimits": null,
      "dnsServers": null,
      "mountPoints": [],
      "workingDirectory": null,
      "secrets": [...],
      "dockerSecurityOptions": null,
      "memory": 512,
      "memoryReservation": 512,
      "volumesFrom": [],
      "stopTimeout": 30,
      "image": "drone/drone:1",
      "startTimeout": null,
      "dependsOn": null,
      "disableNetworking": null,
      "interactive": null,
      "healthCheck": null,
      "essential": true,
      "links": null,
      "hostname": null,
      "extraHosts": null,
      "pseudoTerminal": null,
      "user": null,
      "readonlyRootFilesystem": false,
      "dockerLabels": null,
      "systemControls": null,
      "privileged": null,
      "name": "drone-server"
    }
  ],
  "placementConstraints": [],
  "memory": "512",
  "taskRoleArn": "ROLE_ARN",
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "taskDefinitionArn": "TASK_DEFINITION_ARN",
  "family": "drone-server",
  "requiresAttributes": [
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.execution-role-awslogs"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.task-iam-role"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.container-ordering"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.secrets.ssm.environment-variables"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "targetId": null,
      "targetType": null,
      "value": null,
      "name": "ecs.capability.task-eni"
    }
  ],
  "pidMode": null,
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "256",
  "revision": 14,
  "status": "ACTIVE",
  "proxyConfiguration": null,
  "volumes": []
}
```

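With ECS on EC2, your container port (such as 80) is mapped to a dynamic port on the host (such as 35467), and that port is then registered with a TargetGroup of type 'instance'. (Technically, this happens if you pass zero as the host port mapped to port 80 on the container; AWS treats that as 'dynamically assign a port on the host'.)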

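The big difference with Fargate is that it uses ENIs attached to the task for networking, and every task has its own private IP address (it can have a public one too, if you wish).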

Then, using that unique IP address (rather than an instance-unique port), it registers the task's IP address on port 80 with a TargetGroup of type 'ip'.
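To make the difference concrete, here is a minimal sketch of what ends up registered in each case. ECS performs the registration for you; the function `target_for_task`, the instance ID, and the private IP below are hypothetical and only illustrate the shape of the target (an ID plus a port).

```python
def target_for_task(target_type, container_port, instance_id=None,
                    dynamic_host_port=None, task_private_ip=None):
    """Return the Id/Port pair a target group of the given type would hold.

    Illustration only: this helper is hypothetical, ECS registers targets itself.
    """
    if target_type == "instance":
        # ECS on EC2 with dynamic port mapping: the target is the EC2
        # instance, and the dynamically assigned host port tells tasks apart.
        if instance_id is None or dynamic_host_port is None:
            raise ValueError("instance targets need an instance ID and a host port")
        return {"Id": instance_id, "Port": dynamic_host_port}
    if target_type == "ip":
        # Fargate (awsvpc): the target is the task's own ENI address,
        # and the port is the container port itself.
        if task_private_ip is None:
            raise ValueError("ip targets need the task's private IP")
        return {"Id": task_private_ip, "Port": container_port}
    raise ValueError(f"unsupported target type: {target_type!r}")


# Hypothetical values; 35467 is the example dynamic port mentioned above.
ec2_target = target_for_task("instance", 80,
                             instance_id="i-0123456789abcdef0",
                             dynamic_host_port=35467)
fargate_target = target_for_task("ip", 80, task_private_ip="10.0.1.23")
print(ec2_target)
print(fargate_target)
```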

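So there are two things that can go wrong. First, on Fargate your task must use the same host port and container port (e.g. 80:80). Second, you have to make sure the service registers with a TargetGroup of type 'ip'.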
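Both conditions can be checked mechanically before deploying. The sketch below is an assumption-laden illustration, not an AWS tool: `fargate_network_problems` is a hypothetical helper that inspects a task definition dictionary shaped like the JSON in the question, and the target group type is passed in as a plain string.

```python
def fargate_network_problems(task_definition, target_group_type):
    """List likely networking misconfigurations for a Fargate service.

    Hypothetical helper: it only checks the two conditions discussed above,
    plus the awsvpc network mode that Fargate requires.
    """
    problems = []
    if task_definition.get("networkMode") != "awsvpc":
        problems.append("networkMode must be 'awsvpc' on Fargate")
    for container in task_definition.get("containerDefinitions", []):
        for mapping in container.get("portMappings") or []:
            container_port = mapping.get("containerPort")
            host_port = mapping.get("hostPort")
            # In awsvpc mode hostPort may be omitted; if present it has to
            # equal containerPort (no dynamic mapping such as hostPort 0).
            if host_port is not None and host_port != container_port:
                problems.append(
                    f"{container.get('name')}: hostPort {host_port} "
                    f"does not match containerPort {container_port}"
                )
    if target_group_type != "ip":
        problems.append("target group must be of type 'ip', "
                        f"got {target_group_type!r}")
    return problems


# Trimmed-down version of the task definition from the question.
task_definition = {
    "networkMode": "awsvpc",
    "containerDefinitions": [
        {
            "name": "drone-server",
            "portMappings": [
                {"hostPort": 80, "protocol": "tcp", "containerPort": 80}
            ],
        }
    ],
}
print(fargate_network_problems(task_definition, "ip"))  # prints []
```

An empty list here means the task definition in the question passes both checks, so the remaining suspect would be the target group type configured on the Terraform side.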

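I'm not a Terraform user, so I'm not sure how much of that is under your control, but I suspect one of these two things isn't right, which keeps your service/task from starting up properly.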

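Apparently Fargate isn't very good at reporting errors or showing status. The AWS console doesn't show all the environment variables or the correct status, yet everything works anyway.

The moral of the story: if something isn't shown in the console, make sure to test whether it really isn't working.

To be honest, I never actually solved my problem, because it disappeared when I turned on trace logging on the Drone CI server through environment variables.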