How can I send a command to X number of EC2 instances via SSH?

I have a lot of AWS EC2 instances, and I need to run a Python script on all of them at the same time.

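I have been trying to do this from my PC by sending the required commands over SSH. To that end, I wrote another Python script that opens a cmd terminal and then runs some commands (the ones needed to execute the Python script on each instance). Since I need all of these cmd terminals open at the same time, I used ThreadPoolExecutor, which (given my PC's specs) lets me run 60 of them in parallel. This is the code: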

import os
from concurrent.futures import ThreadPoolExecutor

ipAddressesList=list(open("hosts.txt").read().splitlines())

def functionMain(threadID):
    os.system(r'start cmd ssh -o StrictHostKeyChecking=no -i mysshkey.pem ec2-user@'+ipAddressesList[threadID]+' "cd scripts && python3.7 script.py"')

functionMainList =list(range(0,len(ipAddressesList)))

with ThreadPoolExecutor() as executor:

    results = executor.map(functionMain, functionMainList)

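The problem is that the command that runs script.py blocks the terminal until the process finishes, so functionMain keeps waiting for the result. I would like a way for the function to return as soon as the python3.7 script.py command has been sent, while the script keeps running on the instance, so that the pool executor can move on to the next thread.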

Forgive me for not providing a "code" answer, but I believe existing tools already solve this problem. This sounds like an ideal use case for ClusterShell:

ClusterShell provides a light and unified command execution Python framework to help administer GNU/Linux or BSD clusters. Some of the most important benefits of using ClusterShell are to:

  • provide an efficient, parallel and highly scalable command execution engine in Python,

With clush, you can run a command in parallel across many nodes. It also has an option to group the output by hostname.
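As a rough sketch of how the question's command could be handed to clush, the Python below only builds the argument list and prints it. The helper name build_clush_command, the run.log file name, and the sample IP addresses are made up for illustration. The nohup wrapper with redirected input and output is my assumption about how to let the SSH session close while the script keeps running; it is not something clush does for you.

```python
import shlex


def build_clush_command(hosts, remote_command,
                        user="ec2-user", key_file="mysshkey.pem"):
    """Return the argv list for one clush call that targets every host."""
    # Assumption: nohup plus redirected stdin/stdout/stderr detaches the
    # remote process, so the SSH session can end while the script runs on.
    # run.log is a hypothetical log file in the remote user's home directory.
    detached = (
        f"nohup sh -c {shlex.quote(remote_command)} "
        "> run.log 2>&1 < /dev/null &"
    )
    return [
        "clush",
        "-w", ",".join(hosts),   # target nodes, comma separated
        "-l", user,              # remote login name
        "-o", f"-i {key_file} -o StrictHostKeyChecking=no",  # passed to ssh
        "-b",                    # gather and group the output
        detached,
    ]


if __name__ == "__main__":
    # Placeholder addresses; in practice these would come from hosts.txt.
    sample_hosts = ["203.0.113.10", "203.0.113.11"]
    cmd = build_clush_command(sample_hosts, "cd scripts && python3.7 script.py")
    print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run once clush is installed on the machine you are launching from.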

Another option is Ansible, but you'll need to create a playbook in that case, whereas with ClusterShell you run a command the same way you would with SSH. With Ansible, you create a target group for a playbook, and it connects to each instance and tells it to run the playbook. To make it disconnect while the command is still running, look into asynchronous actions:

By default Ansible runs tasks synchronously, holding the connection to the remote node open until the action is completed. This means within a playbook, each task blocks the next task by default, meaning subsequent tasks will not run until the current task completes. This behavior can create challenges. For example, a task may take longer to complete than the SSH session allows for, causing a timeout. Or you may want a long-running process to execute in the background while you perform other tasks concurrently. Asynchronous mode lets you control how long-running tasks execute.
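To make the asynchronous mode concrete, here is a minimal sketch that renders a one-task playbook from Python. The async and poll keywords are Ansible's own; the render_playbook helper, the task name, and the one-hour timeout are placeholders I picked. Values are quoted with json.dumps because a JSON string is also a valid YAML scalar.

```python
import json

# poll: 0 tells Ansible to start the task and move on without waiting;
# async is the maximum number of seconds the task is allowed to run.
PLAYBOOK_TEMPLATE = """\
- hosts: all
  remote_user: {user}
  gather_facts: false
  tasks:
    - name: Start the script without waiting for it to finish
      ansible.builtin.shell: {command}
      args:
        chdir: {workdir}
      async: {timeout_s}
      poll: 0
"""


def render_playbook(command, workdir, timeout_s=3600, user="ec2-user"):
    """Return playbook text that launches `command` in fire-and-forget mode."""
    return PLAYBOOK_TEMPLATE.format(
        user=json.dumps(user),
        command=json.dumps(command),
        workdir=json.dumps(workdir),
        timeout_s=int(timeout_s),
    )


if __name__ == "__main__":
    print(render_playbook("python3.7 script.py", "scripts"))
```

Saved to a file (say run_script.yml, a name chosen here), it could be run with something like ansible-playbook -i hosts.txt --private-key mysshkey.pem run_script.yml, assuming hosts.txt lists one address per line.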

I have used both approaches in HPC environments with more than 5,000 machines, and either one should serve your needs well.

AWS Systems Manager Run Command can be used to run scripts on multiple Amazon EC2 instances (and even on on-premises computers, if they have the Systems Manager agent installed).

Run Command can also report the result of the command from each instance.

This is definitely preferable to connecting to the instances over SSH to run commands.
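A minimal sketch of what this could look like with boto3, assuming the instances are managed by Systems Manager and AWS credentials are configured locally. The helper names and the /home/ec2-user/scripts path are my assumptions, not something from the question. A single send_command call accepts at most 50 instance IDs, so the list is split into batches.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_send_command_requests(instance_ids, commands, batch_size=50):
    """Build keyword arguments for ssm.send_command, one dict per batch."""
    return [
        {
            "InstanceIds": batch,
            "DocumentName": "AWS-RunShellScript",  # AWS-provided document
            "Parameters": {"commands": list(commands)},
        }
        for batch in chunked(list(instance_ids), batch_size)
    ]


def send_all(instance_ids, commands):
    """Send the commands to every instance and return the command IDs."""
    import boto3  # third-party; imported here so the helpers work without it

    ssm = boto3.client("ssm")
    command_ids = []
    for request in build_send_command_requests(instance_ids, commands):
        response = ssm.send_command(**request)  # returns without waiting
        command_ids.append(response["Command"]["CommandId"])
    return command_ids


# Example call (assumed path; the commands do not run as ec2-user by default):
# send_all(["i-0123456789abcdef0"],
#          ["cd /home/ec2-user/scripts && python3.7 script.py"])
```

send_command returns as soon as the request is accepted, so nothing on your PC blocks while the script runs. The per-instance result can be fetched later with get_command_invocation, using the returned command ID and the instance ID.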