Python subprocess communicate() yields None when a list of numbers is expected

Question

When I run the following code

from subprocess import call, check_output, Popen, PIPE

gr = Popen(["grep", "'^>'", myfile], stdout=PIPE)
sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout)
gr.stdout.close()
out = sd.communicate()[0]
print out

My file looks like this:

>name len=345
sometexthere
>name2 len=4523
someothertexthere
...
...

I get

None

when the expected output is a list of numbers:

345
4523
...
...

The corresponding command I run in the terminal is

grep "^>" myfile | sed "s/.*len=//" > outfile

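So far I have tried escaping and quoting in different ways, such as escaping the slashes in sed or adding extra quotes for grep, but the number of possible combinations is large.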

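I have also considered just reading the file and writing Python equivalents of grep and sed, but the file is very large (although I could always read it line by line), it will always run on a UNIX-based system, and I am still curious where I made the mistake.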

Could it be that

sd.communicate()[0]

returns some kind of object of type None (rather than a list of integers)?

I know that in simple cases I can use check_output to capture the output:

sam = check_output(["samn", "stats", myfile])

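but I am not sure how to make it work in more complex cases where data is being passed through a pipe.

What are effective ways to get the expected result using subprocess?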

Answer 1

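You need to redirect stdout in the second Popen call. Otherwise the output goes straight to the parent process's standard output, and communicate() returns None for stdout because there is no pipe to read from.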

sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
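To see the whole pipeline working end to end, here is a minimal sketch. It assumes grep and sed are on the PATH, it also drops the inner single quotes around ^> (as the other answers point out), and the helper name header_lengths and the temporary sample file are made up for illustration.

```python
import os
import tempfile
from subprocess import Popen, PIPE


def header_lengths(path):
    """Roughly equivalent to: grep '^>' path | sed 's/.*len=//'"""
    gr = Popen(["grep", "^>", path], stdout=PIPE)
    # stdout=PIPE is what lets communicate() return the data instead of None.
    sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
    gr.stdout.close()  # so grep can receive SIGPIPE if sed exits early
    out = sd.communicate()[0]
    gr.wait()
    return out.decode()


# Hypothetical sample data mirroring the file layout in the question.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(">name len=345\nsometexthere\n>name2 len=4523\nsomeothertexthere\n")

try:
    result = header_lengths(tmp.name)
finally:
    os.remove(tmp.name)

print(result)
```

Note that communicate() gives back bytes (one string, not a list), so you would still split it into lines and convert to int yourself if you want numbers.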

Answer 2

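Do not put single quotes around ^> in the grep line. This is not bash, so every argument is passed literally to the underlying program.
You also need to redirect sd's stdout to PIPE.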
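A quick way to convince yourself of the quoting point is to spawn a child that just prints the argument it received. This is only an illustrative sketch: the child here is the Python interpreter itself (via sys.executable) rather than grep, and the names echo_arg, quoted and plain are mine.

```python
import sys
from subprocess import check_output

# A tiny child process that echoes back the first argument it was given.
echo_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# No shell is involved, so nothing strips the single quotes.
quoted = check_output(echo_arg + ["'^>'"]).decode().strip()
plain = check_output(echo_arg + ["^>"]).decode().strip()

print(quoted)  # the quotes are still part of the argument
print(plain)
```

With the quotes included, grep would be looking for lines that start with a literal ' character followed by >', which is why nothing matched.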

Answer 3

As suggested, you need stdout=PIPE in the second process and you need to remove the single quotes from "'^>'":

gr = Popen(["grep", "^>", myfile], stdout=PIPE)
sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
......

But this can be done simply with pure Python and re:

import re

# Raw string avoids the invalid "\>" escape; ">" needs no escaping in a regex.
r = re.compile(r"^>.*len=(.*)$")
with open("test.txt") as f:
    for line in f:
        m = r.search(line)
        if m:
            print(m.group(1))

This outputs:

345
4523

If the lines starting with > always contain a number and the number always comes after len=, you do not actually need a regular expression either:

with open("test.txt") as f:
    for line in f:
        if line.startswith(">"):
            # strip() drops the trailing newline so print() does not add a blank line
            print(line.rsplit("len=", 1)[1].strip())
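Since the question asks for a list of numbers rather than printed text, the same idea can be wrapped so it returns integers. This is a sketch under the same assumption as above (every header line ends with len= followed by digits); the function name read_lengths and the sample list are hypothetical.

```python
def read_lengths(lines):
    """Collect the integer after 'len=' from every line that starts with '>'."""
    lengths = []
    for line in lines:
        if line.startswith(">") and "len=" in line:
            # int() ignores the trailing newline left on the line
            lengths.append(int(line.rsplit("len=", 1)[1]))
    return lengths


# Stand-in for an open file; any iterable of lines works.
sample = [">name len=345\n", "sometexthere\n", ">name2 len=4523\n", "someothertexthere\n"]
print(read_lengths(sample))
```

Because it iterates line by line, you can pass an open file object directly and a very large file is never loaded into memory at once.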

Answer 4

Padraic Cunningham's answer is acceptable.

If you want to keep the single quotes in a command-line string, use shlex to split it:

import shlex
from subprocess import Popen, PIPE

# shlex.split applies shell-style quoting rules, so '^>' becomes the bare argument ^>
gr = Popen(shlex.split("grep '^>' my_file"), stdout=PIPE)
sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
gr.stdout.close()
out = sd.communicate()[0]
print(out)
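If it helps, you can look at what shlex.split actually hands to Popen. This small sketch just prints the resulting argument lists; the variable names are mine.

```python
import shlex

# The shell-style quotes are consumed during splitting and do not reach the program.
grep_args = shlex.split("grep '^>' my_file")
sed_args = shlex.split("sed 's/.*len=//'")

print(grep_args)
print(sed_args)
```

So you can paste the command the way you would type it in the terminal and still get the list form that Popen expects without shell=True.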


python

subprocess