subprocess.communicate() mysteriously hangs only when run from a script

Question

I am calling a Python tool named spark-ec2 from a Bash script.

As part of its work, spark-ec2 makes several calls to the system's ssh command via the subprocess module.

s = subprocess.Popen(
    ssh_command(opts) + ['-t', '-t', '-o', 'ConnectTimeout=3',
                         '%s@%s' % (opts.user, host), stringify_command('true')],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT  # we pipe stderr through stdout to preserve output order
)
cmd_output = s.communicate()[0]  # [1] is stderr, which we redirected to stdout

For some reason, spark-ec2 hangs on the line where communicate() is called. I have no idea why.

For the record, here is an excerpt showing how I invoke spark-ec2:

# excerpt from script-that-calls-spark-ec2.sh

# snipped: load AWS keys and do other setup stuff

timeout 30m spark-ec2 launch "$CLUSTER_NAME" ...

# snipped: if timeout, report and exit

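What kills me is that spark-ec2 works fine when I call it on its own, and it also works fine when I copy and paste the commands from this Bash script and run them interactively.

It is only when I execute the whole script like this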

$ ./script-that-calls-spark-ec2.sh

that spark-ec2 hangs on that communicate() step. This is driving me nuts.

What is going on?

Answer 1

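This is one of those things that, once I figured it out, made me say "Wow..." out loud in a mix of amazement and disgust.

In this case, spark-ec2 is not hanging because of some deadlock related to the use of subprocess.PIPE, as might be the case if spark-ec2 had used Popen.wait() instead of Popen.communicate().

As hinted at by the fact that spark-ec2 only hangs when the whole Bash script is invoked at once, the problem is caused by something that behaves differently depending on whether or not it is being called interactively.

In this case the culprit is the GNU coreutils utility timeout, and an option it offers called --foreground.

From the timeout man page: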
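To illustrate the distinction drawn above, here is a minimal sketch (not spark-ec2's code) in which a child process writes more output than a pipe buffer typically holds. The helper name run_and_capture and the byte count are made up for illustration.

```python
import subprocess
import sys


def run_and_capture(n_bytes=1_000_000):
    """Spawn a child that writes n_bytes to stdout and collect all of it."""
    child = subprocess.Popen(
        [sys.executable, "-c", f"import sys; sys.stdout.write('x' * {n_bytes})"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as in the question
    )
    # communicate() keeps reading the pipe until EOF and then reaps the child,
    # so the child never blocks on a full pipe buffer.
    output, _ = child.communicate()
    return child.returncode, len(output)


if __name__ == "__main__":
    print(run_and_capture())
```

Swapping communicate() for wait() in this sketch could block forever once the pipe fills, because nothing would be reading from it. That is the PIPE deadlock that is ruled out here.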

   --foreground
          when not running timeout directly from a shell prompt,
          allow  COMMAND  to  read  from  the TTY and get TTY signals; in this
          mode, children of COMMAND will not be timed out

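Without this option, Python's communicate() cannot read the output of the SSH command invoked by subprocess.Popen().

This probably has something to do with SSH allocating TTYs via the -t switches, but honestly I don't fully understand it.

What I can say, though, is that modifying the Bash script to use the --foreground option, like so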
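One way to probe the TTY angle is to check, from inside the Python process, whether it is attached to a terminal and whether it sits in that terminal's foreground process group. The sketch below is only a diagnostic under assumptions: the function name terminal_status is hypothetical, it relies on the Unix-only calls os.tcgetpgrp and os.getpgrp, and it does not by itself prove that process groups are the cause of the hang.

```python
import os


def terminal_status(fd=0):
    """Report whether fd is a TTY and whether we are in its foreground group."""
    if not os.isatty(fd):
        return {"isatty": False, "foreground": None}
    try:
        # The foreground process group of the terminal versus our own group.
        foreground = os.tcgetpgrp(fd) == os.getpgrp()
    except OSError:
        # No usable controlling terminal; treat the answer as unknown.
        foreground = None
    return {"isatty": True, "foreground": foreground}


if __name__ == "__main__":
    print(terminal_status())
```

Running this under timeout with and without --foreground, from the script and from an interactive prompt, would show whether the "foreground" value differs between the working and the hanging case.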

timeout --foreground 30m spark-ec2 launch "$CLUSTER_NAME" ...

makes everything work as expected.

Now, if I were you, I would consider converting that Bash script into something else that won't drive you nuts...
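If the wrapper were moved into Python, the time limit could be applied with the standard library instead of the timeout utility. This is a rough sketch under assumptions, not a tested replacement for the script: run_with_timeout is a hypothetical helper, and the spark-ec2 arguments are left elided as in the original.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd; return its exit code, or None if it exceeded the time limit."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Stand-in command; the real call would be spark-ec2 launch ... with 30 * 60 seconds.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"], 30))
```

Note that only the direct child is killed on timeout. Any processes it spawned may keep running, which is similar to the caveat the man page gives for --foreground.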
