How to stop AWS Fargate container programmatically?

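Question: We have a scheduled job (AWS Fargate + Lambda + DynamoDB) that runs every day. Occasionally the container gets stuck, and when that happens we need to stop the container and trigger the same job again.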

Is there any way to stop or time out an AWS Fargate task?

You can check the status of the tasks, stop and restart them with the ECS CLI, and wrap all of this in a script.

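You can do this in the console or through the AWS CLI. You probably want the stop-task API for this. It requires that you first get the task IDs by listing the tasks in the cluster, and then filter them by whatever you need (the task definition, I guess?).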
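The list, filter, and stop steps above could be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: `ecs` is assumed to be a boto3 ECS client (`boto3.client("ecs")`), "stuck" is assumed to mean "running longer than a maximum runtime", and the function names, the stop reason, and the idea of re-running the task directly with `run_task` are illustrative assumptions. If your job is normally started by the Lambda, you may prefer to invoke that instead.

```python
from datetime import datetime, timedelta, timezone


def find_stuck_tasks(ecs, cluster, family, max_runtime, now=None):
    """Return ARNs of RUNNING tasks in `family` that started more than `max_runtime` ago."""
    now = now or datetime.now(timezone.utc)
    # list_tasks can filter by task definition family and desired status.
    # Only the first page of results is read here (assumption: few tasks).
    arns = ecs.list_tasks(cluster=cluster, family=family, desiredStatus="RUNNING")["taskArns"]
    if not arns:
        return []  # nothing to describe
    tasks = ecs.describe_tasks(cluster=cluster, tasks=arns)["tasks"]
    return [
        t["taskArn"]
        for t in tasks
        # startedAt is only present once the task has actually started
        if "startedAt" in t and now - t["startedAt"] > max_runtime
    ]


def stop_and_rerun(ecs, cluster, task_arns, task_definition, network_config):
    """Stop each given task and start a fresh Fargate task in its place."""
    new_arns = []
    for arn in task_arns:
        ecs.stop_task(cluster=cluster, task=arn, reason="Exceeded maximum runtime")
        started = ecs.run_task(
            cluster=cluster,
            taskDefinition=task_definition,
            launchType="FARGATE",
            networkConfiguration=network_config,  # awsvpc subnets etc., supplied by the caller
        )
        new_arns += [t["taskArn"] for t in started["tasks"]]
    return new_arns
```

A scheduled Lambda could call `find_stuck_tasks` with, say, `timedelta(hours=2)` and pass the result to `stop_and_rerun`. The cluster name, family, and threshold are yours to choose.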

You don't need to build any extra logic. You can use the built-in HEALTHCHECK feature; the official documentation page has more details.

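When you create the task definition, the container section has an advanced container configuration area. There you can specify the following attributes for HEALTHCHECK: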

Attribute	Description	Default Value
Command	The health check command; it can be a one-line command or a call to a shell script	None
Interval	The time in seconds between health check executions, from 5 to 300 seconds	30 seconds
Timeout	The time in seconds to wait for a health check to succeed before it is considered a failure, from 2 to 60 seconds	5 seconds
Start Period	The time to wait for the container to boot before health checks count, from 0 to 300 seconds	Disabled
Retries	The number of times to retry a failed health check before the container is considered unhealthy, from 1 to 10 retries	3 retries
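If you define the task in code rather than in the console, the same attributes go into the `healthCheck` section of the container definition. The helper below is a minimal sketch in Python: `build_health_check` is a hypothetical name, it wraps the command in the `CMD-SHELL` form, and the range checks simply mirror the limits in the table above.

```python
def build_health_check(command, interval=30, timeout=5, retries=3, start_period=None):
    """Build the `healthCheck` section of an ECS container definition."""
    limits = {
        "interval": (interval, 5, 300),
        "timeout": (timeout, 2, 60),
        "retries": (retries, 1, 10),
    }
    if start_period is not None:
        # The start period is disabled unless it is set explicitly
        limits["startPeriod"] = (start_period, 0, 300)

    # CMD-SHELL runs the command string through the container's default shell
    health_check = {"command": ["CMD-SHELL", command]}
    for name, (value, low, high) in limits.items():
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        health_check[name] = value
    return health_check
```

The returned dictionary could then be placed under the `healthCheck` key of an entry in `containerDefinitions` when registering the task definition.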

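So once you have specified the health check, as a one-line command or a script, with suitable parameters, it runs continuously. Depending on the result of the HealthCheck command you specified, each execution has one of two outcomes: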

Condition	HealthCheck Status
An exit code of 0	Success
A non-zero exit code	Failure
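For a batch job that can hang, one possible health check is a heartbeat: the job touches a file whenever it makes progress, and the check fails once that file goes stale. The following is a minimal sketch under that assumption. The path `/tmp/heartbeat`, the 300 second threshold, and the function names are all hypothetical, and your job would have to update the file itself.

```python
import os
import time

HEARTBEAT_FILE = "/tmp/heartbeat"  # hypothetical path the job touches as it makes progress
MAX_AGE_SECONDS = 300              # hypothetical staleness threshold


def is_healthy(path=HEARTBEAT_FILE, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the heartbeat file exists and was modified within `max_age` seconds."""
    now = time.time() if now is None else now
    try:
        return now - os.path.getmtime(path) <= max_age
    except OSError:
        return False  # a missing or unreadable file counts as unhealthy


def exit_code(path=HEARTBEAT_FILE, max_age=MAX_AGE_SECONDS, now=None):
    """Map the check result to the exit code convention: 0 is success, non-zero is failure."""
    return 0 if is_healthy(path, max_age, now) else 1
```

Used as the actual health check command, the script would end with `sys.exit(exit_code())` (after `import sys`) so that the container runtime sees the exit code.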

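So once the container has failed the health check for all of the configured retries in a row, it is considered unhealthy and is then stopped and replaced automatically. You don't need an extra controller, checker, or reconciler! Just write a simple health check that truly indicates whether your container is healthy. Note that the automatic stop and replacement applies to tasks managed by an ECS service; a task that is run on its own, outside a service, continues its lifecycle regardless of its health status.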

There are some conditions to keep in mind, though;