How does pexpect analyze the child's stdout?
Consider the following code:
import pexpect
child = pexpect.spawn("prog")
# some delay...
child.expect("Name .*: ")
child.sendline("anonymous")
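After the child process starts, it may begin sending a lot of data to its stdout, for example log messages. Does this mean pexpect starts searching all of the child's stdout (from process start to the current moment)? Or does pexpect only start doing so once expect is called?
My child process produces a lot of log output, and the CPU is very slow. I suspect this implementation of expect may be the reason.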
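After a child process is spawned, the child will write() its data to the pty (slave side) and wait for the parent to read() that data from the pty (master side). If child.expect() is never called, the child's write() may block once it has produced too much output, because the pty's write buffer is full.
When child.expect() matches a pattern it returns, and you then have to call child.expect() again; otherwise the child may still block after it has produced too much output.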
See the following example:
# python
>>> import pexpect
>>> ch = pexpect.spawn('find /')
>>> ch
<pexpect.pty_spawn.spawn object at 0x7f47390bae90>
>>>
At this point find has been spawned and has already written some output. But I have not called ch.expect(), so find is now blocked (sleeping) and is not consuming CPU.
# ps -C find u
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 100831 0.0 0.2 9188 2348 pts/12 Ss+ 10:23 0:00 /usr/bin/find /
# strace -p 100831
Process 100831 attached
write(1, "\n", 1 <-- The write() is being blocked
Here the S in STAT means sleeping (s means session leader, + means the process is in the foreground process group).
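The same blocking behaviour can be reproduced without pexpect, using only Python's standard library. The following is a minimal sketch, assuming a Unix-like system with pty support; run_demo and TOTAL are names made up for this illustration, and the os.read() loop merely stands in for the reading that expect() performs on the master side.

```python
import os
import pty
import subprocess
import sys
import time

TOTAL = 1_000_000  # bytes the child tries to write; far more than a pty buffers


def run_demo():
    """Start a chatty child on a pty, leave it unread for a while, then drain it."""
    master_fd, slave_fd = pty.openpty()
    # The child's stdout is the slave side of the pty.
    child = subprocess.Popen(
        [sys.executable, "-c", f"import sys; sys.stdout.write('x' * {TOTAL})"],
        stdout=slave_fd,
        stdin=subprocess.DEVNULL,
    )
    os.close(slave_fd)  # the parent keeps only the master side

    time.sleep(1.0)  # nobody reads the master during this second
    blocked = child.poll() is None  # child has not exited; its write() is stuck

    # Drain the master side so the child's write() can make progress.
    received = 0
    while True:
        try:
            chunk = os.read(master_fd, 65536)
        except OSError:  # Linux reports EIO once the slave side is closed
            break
        if not chunk:  # other platforms signal EOF with an empty read
            break
        received += len(chunk)

    os.close(master_fd)
    exit_code = child.wait()
    return blocked, received, exit_code
```

Calling run_demo() returns a tuple (blocked, received, exit_code). If the explanation above holds, blocked should be True because the child cannot finish while nobody reads, and the child should exit only after the parent has drained the master side.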
According to pexpect's documentation, two options of spawn() may affect performance:
The maxread attribute sets the read buffer size. This is maximum number of bytes that Pexpect will try to read from a TTY at one time. Setting the maxread size to 1 will turn off buffering. Setting the maxread value higher may help performance in cases where large amounts of output are read back from the child. This feature is useful in conjunction with searchwindowsize.

When the keyword argument searchwindowsize is None (default), the full buffer is searched at each iteration of receiving incoming data. The default number of bytes scanned at each iteration is very large and may be reduced to collaterally reduce search cost. After expect() returns, the full buffer attribute remains up to size maxread irrespective of searchwindowsize value.
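To see why a smaller search window can cut the cost of each search, here is a toy sketch in plain Python. It is not pexpect's actual code: search_tail is a hypothetical helper, and the log buffer is made-up data. It only illustrates the idea of scanning the tail of an accumulated buffer instead of the whole thing.

```python
import re


def search_tail(buffer, pattern, window=None):
    """Search only the last `window` bytes of buffer (all of it if window is None)."""
    start = 0 if window is None else max(0, len(buffer) - window)
    return pattern.search(buffer, start)


# A large amount of log output followed by the prompt we are waiting for.
log = b"log line\n" * 100_000 + b"Name (anonymous ok): "
prompt = re.compile(rb"Name .*: ")

full = search_tail(log, prompt)             # scans the whole buffer
tail = search_tail(log, prompt, window=64)  # scans only the last 64 bytes
tiny = search_tail(log, prompt, window=5)   # window shorter than the prompt
```

Both full and tail find the same prompt, but the second call inspects only 64 bytes. The tiny case returns None, which shows the trade-off: a window that is shorter than the text you are waiting for can miss it, so the window has to be at least as long as the longest expected match.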