subprocess.check_output(), zgrep, and match limit
Context: I'm trying to find the GitHub repository of a Python package. To do that, I'm zgrep'ping package archives for GitHub URLs. It works fine until I limit the output to one result:
import subprocess

# works, returns a lot of results
subprocess.check_output(["zgrep", "-oha", "github", 'Django-1.10.1.tgz'])
# add -m1 to limit output: fails with exit status 2
subprocess.check_output(["zgrep", "-m1", "-oha", "github", 'Django-1.10.1.tgz'])
# same command, different file: works
subprocess.check_output(["zgrep", "-m1", "-oha", "github", 'grabber.py'])
On the command line, all three commands work fine. Any ideas?
Traceback:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/subprocess.py", line 574, in check_output
raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['zgrep', '-m1', '-oha', 'github', 'pkgs/Django-1.10.1.tar.gz']' returned non-zero exit status 2
Command line:
$ zgrep -m1 -oha "github.com/[^/]\+/django" pkgs/Django-1.10.1.tar.gz
github.com/django/django
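So, the reason: zgrep is a shell script that simply pipes the archive through gzip and egrep. If we limit the number of results, egrep exits early and closes the pipe, so gzip exits and complains. On the console we never see that complaint, but when the command runs through subprocess the failure surfaces as exit status 2, and check_output raises CalledProcessError.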
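A possible explanation for why this only shows up under subprocess, offered as an assumption rather than something verified in this post: Python sets SIGPIPE to "ignored" at startup, and on Python 2 child processes inherit that setting, so gzip sees a write error instead of being silently killed by the signal. The sketch below restores the default SIGPIPE action in the child before it runs; the helper name `check_output_default_sigpipe` is hypothetical. On Python 3.2+ `subprocess` already does this by default (`restore_signals=True`).

```python
import signal
import subprocess


def _restore_sigpipe():
    # Runs in the child just before exec: give SIGPIPE its default action
    # again, so a writer into a closed pipe is terminated quietly.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)


def check_output_default_sigpipe(cmd):
    # Hypothetical wrapper: same call as in the question, plus preexec_fn.
    # POSIX only; preexec_fn is not supported on Windows.
    return subprocess.check_output(cmd, preexec_fn=_restore_sigpipe)
```

It would be called like the original, for example `check_output_default_sigpipe(["zgrep", "-m1", "-oha", "github", "Django-1.10.1.tgz"])`. Whether that removes the exit status 2 depends on the zgrep version, so treat it as something to try, not a guaranteed fix.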
Solution: write a mini version of zgrep that doesn't complain:
gunzip < "$FILE" 2> /dev/null | egrep -m1 -ohia "$PATTERN"
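Since the caller is Python anyway, the same "first match only" search can also be sketched without any external process, which sidesteps the pipe problem entirely. This is a minimal sketch, not a drop-in replacement: the function name `first_match` is hypothetical, it uses Python's `re` syntax instead of egrep's (so `+` is written without a backslash), and it matches case-insensitively like the `-i` flag above.

```python
import gzip
import re


def first_match(path, pattern):
    """Return the first match of `pattern` in a gzip-compressed file, or None."""
    # Compile as a bytes regex so binary archive content is handled as-is
    # (roughly what -a does for grep).
    regex = re.compile(pattern.encode(), re.IGNORECASE)
    with gzip.open(path, "rb") as archive:
        for line in archive:
            found = regex.search(line)
            if found:
                # Stop at the first hit, like -m1, and return only the
                # matched text, like -o.
                return found.group(0).decode(errors="replace")
    return None
```

Usage would look like `first_match("pkgs/Django-1.10.1.tar.gz", r"github\.com/[^/]+/django")`. Reading stops as soon as a match is found, so nothing is left writing into a closed pipe.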