How to capture inputs and outputs of a child process?

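I'm trying to write a program that takes an executable name as an argument, runs the executable, and reports the inputs and outputs of that run. For example, consider a child program named "circle". This is the run I want from my program: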

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input',  '10\n'), ('output', 'Area: 314.158997\n')]

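I decided to use the pexpect module for this job. It has a method called interact, which lets the user interact with the child program as shown above. It also takes 2 optional parameters: output_filter and input_filter. From the documentation: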

The output_filter will be passed all the output from the child process. The input_filter will be passed all the keyboard input from the user.

So this is the code I wrote:

capture_io.py

import sys
import pexpect

_stdios = []


def read(data):
    _stdios.append(("output", data.decode("utf8")))
    return data


def write(data):
    _stdios.append(("input", data.decode("utf8")))
    return data


def capture_io(argv):
    _stdios.clear()
    child = pexpect.spawn(argv)
    child.interact(input_filter=write, output_filter=read)
    child.wait()
    return _stdios


if __name__ == '__main__':
    stdios_of_child = capture_io(sys.argv[1:])
    print(stdios_of_child)

circle.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
    float radius, area;

    printf("Enter radius of circle: ");
    scanf("%f", &radius);

    if (radius < 0) {
        fprintf(stderr, "Negative radius values are not allowed.\n");
        exit(1);
    }

    area = 3.14159 * radius * radius;
    printf("Area: %f\n", area);
    return 0;
}

which produces the following output:

$ python3 capture_io.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '1'), ('output', '1'), ('input', '0'), ('output', '0'), ('input', '\r'), ('output', '\r\n'), ('output', 'Area: 314.158997\r\n')]

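As you can see from the output, the input is processed character by character and is also echoed back as output, which creates this mess. Is it possible to change this behaviour so that my input_filter will run only when Enter is pressed?

Or more generally, what would be the best way to achieve my goal (with or without pexpect)?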

I don't think you can do that easily. However, I think this should work for you:

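output_buffer = b''

def read(data):
    global output_buffer
    # pexpect passes bytes, so accumulate bytes and decode once per line
    output_buffer += data
    if data.endswith((b'\r', b'\n')):
        _stdios.append(("output", output_buffer.decode("utf8")))
        output_buffer = b''
    return data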

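When I started to write a helper, I realized that the main problem is that the input should be logged line buffered, so that backspace and other editing are applied before the input reaches the program, while the output should be unbuffered so that a prompt that is not terminated by a newline still gets logged.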
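To make the "line buffered input" idea concrete, here is a minimal sketch of an input filter that collects raw keystrokes and only logs a line once Enter arrives. The class name LineInputLogger is made up for illustration, and it assumes a raw-mode terminal that sends a single '\r' for Enter and DEL or BS for backspace. It handles nothing beyond backspace (no arrow keys, no Ctrl-U), so treat it as an illustration rather than a full line editor.

```python
class LineInputLogger:
    """Collect raw keystrokes and log them one edited line at a time."""

    BACKSPACE = (0x08, 0x7F)  # BS and DEL
    ENTER = (0x0D, 0x0A)      # CR and LF

    def __init__(self):
        self.lines = []
        self._pending = bytearray()

    def feed(self, data):
        """Process a chunk of bytes and return it unchanged, like a filter."""
        for byte in data:
            if byte in self.BACKSPACE:
                # Drop the last pending byte, if there is one.
                if self._pending:
                    self._pending.pop()
            elif byte in self.ENTER:
                # The line is complete: log it and start a new one.
                text = self._pending.decode("utf8") + "\n"
                self.lines.append(("input", text))
                self._pending.clear()
            else:
                self._pending.append(byte)
        return data


logger = LineInputLogger()
for key in (b"1", b"2", b"\x7f", b"0", b"\r"):
    logger.feed(key)
print(logger.lines)
```

Note that popping one byte per backspace is only right for single-byte characters, which is another assumption of this sketch.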

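To capture the output for logging, a pipe is needed, but writing to a pipe makes the C library buffer the child's output in blocks, so a prompt without a trailing newline is not delivered right away. A pseudoterminal is known to solve this problem (the expect module is built around a pseudoterminal), but a terminal carries both input and output, and we want to unbuffer only the output.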
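The reason a pipe and a pseudoterminal behave differently is that the child can tell them apart: its standard output is a terminal in one case and not in the other, and the C library picks its buffering mode based on that. The sketch below only demonstrates the detection part, using a tiny Python child instead of a C program. The helper names are made up, and it assumes a Unix system where pty.openpty() is available.

```python
import os
import pty
import subprocess
import sys

# Child program: report whether its stdout looks like a terminal.
CHILD = "import sys; print(sys.stdout.isatty())"


def stdout_is_tty_via_pipe():
    """Run the child with stdout connected to a pipe."""
    result = subprocess.run([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode().strip() == "True"


def stdout_is_tty_via_pty():
    """Run the child with stdout connected to the slave side of a pty."""
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen([sys.executable, "-c", CHILD],
                                stdout=slave_fd)
        os.close(slave_fd)  # the parent only needs the master side
        chunks = []
        while True:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                break  # Linux reports EIO once the child side is closed
            if not data:
                break
            chunks.append(data)
        proc.wait()
    finally:
        os.close(master_fd)
    return b"".join(chunks).decode().strip() == "True"


if __name__ == "__main__":
    print("pipe:", stdout_is_tty_via_pipe())
    print("pty: ", stdout_is_tty_via_pty())
```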

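Fortunately there is the stdbuf utility. On Linux it alters the buffering done by the C library functions of dynamically linked executables, so it is not universally usable.

I modified a Python bidirectional copy program so that it also logs the data it copies. Combined with stdbuf, it produces the desired output.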

import select
import os

STDIN = 0
STDOUT = 1

BUFSIZE = 4096

def main(cmd):
    ipipe_r, ipipe_w = os.pipe()
    opipe_r, opipe_w = os.pipe()
    if os.fork():
        # parent
        os.close(ipipe_r)
        os.close(opipe_w)
        fdlist_r = [STDIN, opipe_r]
        while True:
            ready_r, _, _ = select.select(fdlist_r, [], []) 
            if STDIN in ready_r:
                # STDIN -> program
                data = os.read(STDIN, BUFSIZE)
                if data:
                    yield('in', data)   # optional: convert to str
                    os.write(ipipe_w, data)
                else:
                    # send EOF
                    fdlist_r.remove(STDIN)
                    os.close(ipipe_w)
            if opipe_r in ready_r:
                # program -> STDOUT
                data = os.read(opipe_r, BUFSIZE)
                if not data:
                    # got EOF
                    break
                yield('out', data)
                os.write(STDOUT, data)
        os.wait()
    else:
        # child
        os.close(ipipe_w)
        os.close(opipe_r)
        os.dup2(ipipe_r, STDIN)
        os.dup2(opipe_w, STDOUT)
        # cmd[0] is looked up on PATH; the whole list becomes the child's argv
        os.execvp(cmd[0], cmd)
        # not reached
        os._exit(127)

if __name__ == '__main__':
    log = list(main(['stdbuf', '-o0', './circle']))
    print(log)

It prints:

[('out', b'Enter radius of circle: '), ('in', b'12\n'), ('out', b'Area: 452.388947\n')]
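The log above holds bytes and uses 'in'/'out' tags, while the question asked for strings tagged 'input'/'output'. A small post-processing step can convert one into the other. This is only a sketch: the function name to_text_log is made up, and merging adjacent chunks of the same kind is my own addition, there because os.read may deliver one logical message in several pieces.

```python
def to_text_log(log, encoding="utf8"):
    """Convert [('in'|'out', bytes), ...] to [('input'|'output', str), ...]."""
    kinds = {"in": "input", "out": "output"}
    merged = []
    for kind, data in log:
        kind = kinds.get(kind, kind)
        if merged and merged[-1][0] == kind:
            # Join consecutive chunks before decoding, so a multi-byte
            # character split across two reads still decodes correctly.
            merged[-1] = (kind, merged[-1][1] + data)
        else:
            merged.append((kind, data))
    return [(kind, data.decode(encoding)) for kind, data in merged]


log = [('out', b'Enter radius of circle: '), ('in', b'12\n'),
       ('out', b'Area: 452.388947\n')]
print(to_text_log(log))
```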

Is it possible to change this behaviour so that my input_filter will run only when Enter is pressed?

Yes, you can do it by inheriting from pexpect.spawn and overriding the interact method. I will come to that soon.

As VPfB pointed out in their answer, you can't use a pipe, and I think it's worth mentioning that this issue is also addressed in the pexpect documentation.

You said that:

... input is processed character by character and also echoed back as output ...

If you look at the source code of interact, you can see this line:

tty.setraw(self.STDIN_FILENO)

This sets your terminal to raw mode:

input is available character by character, ..., and all special processing of terminal input and output characters is disabled.
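If you want to see what raw mode changes without touching your real terminal, you can try it on a throwaway pseudoterminal. The sketch below is just an experiment, with helper names of my own choosing: it reads the ICANON (line editing) and ECHO flags before and after tty.setraw. On a typical Linux system a fresh pty should start with both set, but I am only relying on them being cleared afterwards.

```python
import os
import pty
import termios
import tty


def local_flags(fd):
    """Return the line-editing and echo flags of the terminal behind fd."""
    lflag = termios.tcgetattr(fd)[3]  # index 3 holds the local-mode flags
    return {"ICANON": bool(lflag & termios.ICANON),
            "ECHO": bool(lflag & termios.ECHO)}


def flags_before_and_after_raw():
    """Open a scratch pty, switch it to raw mode, and report the flags."""
    master_fd, slave_fd = pty.openpty()
    try:
        before = local_flags(slave_fd)
        tty.setraw(slave_fd)
        after = local_flags(slave_fd)
    finally:
        os.close(slave_fd)
        os.close(master_fd)
    return before, after


if __name__ == "__main__":
    before, after = flags_before_and_after_raw()
    print("before:", before)
    print("after: ", after)
```

With ICANON cleared the kernel no longer assembles lines or handles backspace, which is why every key press reaches the filter on its own.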

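That is why your input_filter function runs on every key press and sees backspace and other special characters. If you could comment out this line, you would see something like this when you run your program: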

$ python3 test.py ./circle
Enter radius of circle: 10
10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '10\n'), ('output', '10\r\n'), ('output', 'Area: 314.158997\r\n')]

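This would also let you edit the input (i.e. 12[Backspace]0 would give you the same result). But as you can see, it still echoes the input. This can be disabled by clearing a single flag on the child's terminal: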

mode = tty.tcgetattr(self.child_fd)
mode[3] &= ~termios.ECHO
tty.tcsetattr(self.child_fd, termios.TCSANOW, mode)
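Here is a small self-contained experiment showing the effect of that flag, again on a scratch pseudoterminal. The names set_echo and echoed_bytes are made up for this sketch, and unlike the snippet above it sets the flag through the slave side of the pty, which I assume is equivalent for this purpose. It writes some "typed" bytes to the master side and then checks whether the terminal sends anything back.

```python
import os
import pty
import select
import termios


def set_echo(fd, enabled):
    """Turn the ECHO flag of the terminal behind fd on or off."""
    mode = termios.tcgetattr(fd)
    if enabled:
        mode[3] |= termios.ECHO
    else:
        mode[3] &= ~termios.ECHO
    termios.tcsetattr(fd, termios.TCSANOW, mode)


def echoed_bytes(enabled, typed=b"10\n", timeout=0.5):
    """Return whatever the terminal echoes back for the typed bytes."""
    master_fd, slave_fd = pty.openpty()
    try:
        set_echo(slave_fd, enabled)
        os.write(master_fd, typed)
        # With echo off there should be nothing to read on the master side.
        ready, _, _ = select.select([master_fd], [], [], timeout)
        return os.read(master_fd, 1024) if ready else b""
    finally:
        os.close(slave_fd)
        os.close(master_fd)


if __name__ == "__main__":
    print("echo on: ", echoed_bytes(True))
    print("echo off:", echoed_bytes(False))
```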

Running with the latest changes:

$ python3 test.py ./circle
Enter radius of circle: 10
Area: 314.158997
[('output', 'Enter radius of circle: '), ('input', '10\n'), ('output', 'Area: 314.158997\r\n')]
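One remaining difference from the output requested in the question is the '\r\n' at the end of the last entry. That comes from the pseudoterminal, which by default translates '\n' to '\r\n' on output. If you want plain '\n' in the log, one option is to normalize it afterwards. A minimal sketch, with a made-up function name, that assumes a '\r\n' pair is never split across two entries:

```python
def normalize_newlines(stdios):
    """Replace terminal-style '\\r\\n' line endings with '\\n' in a log."""
    return [(kind, text.replace("\r\n", "\n")) for kind, text in stdios]


stdios = [('output', 'Enter radius of circle: '), ('input', '10\n'),
          ('output', 'Area: 314.158997\r\n')]
print(normalize_newlines(stdios))
```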

Bingo! Now you can inherit from pexpect.spawn and override the interact method with these changes, or implement the same thing using Python's built-in pty module:

pty:

import os
import pty
import sys
import termios
import tty

_stdios = []

def _read(fd):
    data = os.read(fd, 1024)
    _stdios.append(("output", data.decode("utf8")))
    return data


def _stdin_read(fd):
    data = os.read(fd, 1024)
    _stdios.append(("input", data.decode("utf8")))
    return data


def _spawn(argv):
    pid, master_fd = pty.fork()
    if pid == pty.CHILD:
        os.execlp(argv[0], *argv)

    mode = tty.tcgetattr(master_fd)
    mode[3] &= ~termios.ECHO
    tty.tcsetattr(master_fd, termios.TCSANOW, mode)

    try:
        pty._copy(master_fd, _read, _stdin_read)
    except OSError:
        pass

    os.close(master_fd)
    return os.waitpid(pid, 0)[1]


def capture_io_and_return_code(argv):
    _stdios.clear()
    return_code = _spawn(argv)
    return _stdios, return_code >> 8


if __name__ == '__main__':
    stdios, ret = capture_io_and_return_code(sys.argv[1:])
    print(stdios)
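A note on the return code: shifting the wait status right by 8 gives the exit code only when the child exited normally. If the child can also be killed by a signal, it may be safer to decode the status with the os.W* helpers. The sketch below uses made-up function names and borrows the subprocess convention of returning a negative signal number; on Python 3.9 and later, os.waitstatus_to_exitcode does much the same thing.

```python
import os


def exit_code_from_status(status):
    """Decode a wait status as returned by os.waitpid."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)  # negative signal number
    return None  # stopped or continued, not terminated


def run_and_get_exit_code(code):
    """Fork a child that exits immediately with the given code."""
    pid = os.fork()
    if pid == 0:
        os._exit(code)
    _, status = os.waitpid(pid, 0)
    return exit_code_from_status(status)


if __name__ == "__main__":
    print(run_and_get_exit_code(3))
```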

pexpect:

import sys
import termios
import tty
import pexpect

_stdios = []


def read(data):
    _stdios.append(("output", data.decode("utf8")))
    return data


def write(data):
    _stdios.append(("input", data.decode("utf8")))
    return data


class CustomSpawn(pexpect.spawn):
    def interact(self, escape_character=chr(29),
                 input_filter=None, output_filter=None):
        self.write_to_stdout(self.buffer)
        self.stdout.flush()
        self._buffer = self.buffer_type()
        mode = tty.tcgetattr(self.child_fd)
        mode[3] &= ~termios.ECHO
        tty.tcsetattr(self.child_fd, termios.TCSANOW, mode)
        if escape_character is not None and pexpect.PY3:
            escape_character = escape_character.encode('latin-1')
        self._spawn__interact_copy(escape_character, input_filter, output_filter)


def capture_io_and_return_code(argv):
    _stdios.clear()
    # pexpect.spawn takes the command and its argument list separately
    child = CustomSpawn(argv[0], argv[1:])
    child.interact(input_filter=write, output_filter=read)
    child.wait()
    return _stdios, child.status >> 8


if __name__ == '__main__':
    stdios, ret = capture_io_and_return_code(sys.argv[1:])
    print(stdios)