Why does Python generate a SIGPIPE exception on closing a FIFO file?
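TL;DR: Why does closing a FIFO file (named pipe) that has already received a SIGPIPE exception produce another SIGPIPE exception?
My Python script writes bytes through a FIFO file to another process, which is a child of my Python process. (Because of some constraints, I have to use a named pipe.)
I have to account for the fact that the child process may terminate prematurely. If that happens, my Python script must reap the dead child and start it again.
To check whether the child has terminated, I simply try to write to the FIFO first; if I get a SIGPIPE exception (actually an IOError indicating a broken pipe), I know it is time to restart the child.
A minimal example follows: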
#!/usr/bin/env python3

import os
import signal
import subprocess

# The FIFO file.
os.mkfifo('tmp.fifo')

# A subprocess to simply discard any input from the FIFO.
FNULL = open(os.devnull, 'w')
proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
print('pid = %d' % proc.pid)

# Open the FIFO, and MUST BE BINARY MODE.
fifo = open('tmp.fifo', 'wb')

# Endlessly write to the FIFO.
while True:
    # Try to write to the FIFO, restart the subprocess on demand, until succeeded.
    while True:
        try:
            # Optimistically write to the FIFO.
            fifo.write(b'hello')
        except IOError as e:
            # The subprocess died. Close the FIFO and reap the subprocess.
            fifo.close()
            os.kill(proc.pid, signal.SIGKILL)
            proc.wait()
            # Start the subprocess again.
            proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
            print('pid = %d' % proc.pid)
            fifo = open('tmp.fifo', 'wb')
        else:
            # The write goes on well.
            break
To reproduce the result, run the script and manually kill the child process with kill -9 <pid>. The traceback will show:
Traceback (most recent call last):
  File "./test.py", line 24, in <module>
    fifo.write(b'hello')
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./test.py", line 27, in <module>
    fifo.close()
BrokenPipeError: [Errno 32] Broken pipe
So why does closing the FIFO file produce another SIGPIPE exception?
I tested on the following platforms, with the same result:
Python 3.7.6 @ Darwin Kernel Version 19.3.0 (macOS 10.15.3)
Python 3.6.8 @ Linux 4.18.0-147.3.1.el8_1.x86_64 (CentOS 8)
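This happens because Python does not clear the write buffer when fifo.write fails. When fifo.close runs, it flushes the buffer first, so the same data is written to the broken pipe again, which causes the second SIGPIPE.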
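The retained buffer can be observed without a real pipe. The sketch below is only an illustration: BrokenRaw and flush_then_close are hypothetical names, not part of the scripts in this post. BrokenRaw is a raw stream whose write always fails with EPIPE and records what it was asked to write. Wrapping it in io.BufferedWriter should show the same bytes being offered twice, once by flush and once by close.

```python
import errno
import io


class BrokenRaw(io.RawIOBase):
    """Hypothetical raw stream that behaves like a pipe with no reader."""

    def __init__(self):
        super().__init__()
        self.attempts = []  # every payload the buffered layer tried to write

    def writable(self):
        return True

    def write(self, data):
        # Record the payload, then fail the way a broken pipe would.
        self.attempts.append(bytes(data))
        raise BrokenPipeError(errno.EPIPE, 'Broken pipe')


def flush_then_close(payload):
    """Return (write attempts seen by the raw stream, stages that raised)."""
    raw = BrokenRaw()
    writer = io.BufferedWriter(raw)
    writer.write(payload)  # small payload: only stored in the buffer
    failed = []
    for stage, action in (('flush', writer.flush), ('close', writer.close)):
        try:
            action()
        except BrokenPipeError:
            failed.append(stage)
    return raw.attempts, failed


attempts, failed = flush_then_close(b'hello2')
print(attempts, failed)
```

Both stages raise, and the raw stream sees the identical payload twice, because the failed flush leaves the pending bytes in the buffer for close to retry.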
I found the cause with the help of strace. Here are some details.
First, modify the Python code slightly, as follows:
#!/usr/bin/env python3

import os
import signal
import subprocess

# The FIFO file.
os.mkfifo('tmp.fifo')

# A subprocess to simply discard any input from the FIFO.
FNULL = open(os.devnull, 'w')
proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
print('pid = %d' % proc.pid)

# Open the FIFO, and MUST BE BINARY MODE.
fifo = open('tmp.fifo', 'wb')
i = 0

# Endlessly write to the FIFO.
while True:
    # Try to write to the FIFO, restart the subprocess on demand, until succeeded.
    while True:
        try:
            # Optimistically write to the FIFO.
            fifo.write(f'hello{i}'.encode())
            fifo.flush()
        except IOError as e:
            # The subprocess died. Close the FIFO and reap the subprocess.
            print('IOError is occured.')
            fifo.close()
            os.kill(proc.pid, signal.SIGKILL)
            proc.wait()
            # Start the subprocess again.
            proc = subprocess.Popen(['/bin/cat', 'tmp.fifo'], stdout=FNULL, stderr=FNULL)
            print('pid = %d' % proc.pid)
            fifo = open('tmp.fifo', 'wb')
        else:
            # The write goes on well.
            break
    os.kill(proc.pid, signal.SIGKILL)
    i += 1
Save it as test.py.
Then run strace -o strace.out python3 test.py in a shell. Inspecting strace.out, we find something like:
openat(AT_FDCWD, "tmp.fifo", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
fstat(4, {st_mode=S_IFIFO|0644, st_size=0, ...}) = 0
ioctl(4, TCGETS, 0x7ffcba5cd290) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
write(4, "hello0", 6) = 6
kill(35626, SIGKILL) = 0
write(4, "hello1", 6) = 6
kill(35626, SIGKILL) = 0
write(4, "hello2", 6) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=35625, si_uid=1000} ---
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=35626, si_uid=1000, si_status=SIGKILL, si_utime=0, si_stime=0} ---
write(1, "IOError is occured.\n", 20) = 20
write(4, "hello2", 6) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=35625, si_uid=1000} ---
close(4) = 0
write(2, "Traceback (most recent call last"..., 35) = 35
write(2, " File \"test.py\", line 26, in <m"..., 39) = 39
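Note that Python tries to write hello2 twice, once during fifo.flush and once during fifo.close. This output explains nicely why two SIGPIPE exceptions are generated.
To solve the problem, we can disable the write buffer with open('tmp.fifo', 'wb', buffering=0). Then only one SIGPIPE exception is generated.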
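The difference between the two modes can be checked with a small self-contained sketch. It rests on several assumptions: a POSIX system, a temporary FIFO named demo.fifo (a made-up path), and a reader simulated inside the same process by opening the read end non-blocking and then closing it, rather than the /bin/cat child used in the scripts in this post. The helper name write_after_reader_gone is hypothetical as well.

```python
import os
import tempfile


def write_after_reader_gone(buffering):
    """Return the stages that raised BrokenPipeError once the reader is gone."""
    failed = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.fifo')  # hypothetical FIFO path
        os.mkfifo(path)
        # Open the read end non-blocking so opening the write end does not block.
        reader_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        fifo = open(path, 'wb', buffering=buffering)
        os.close(reader_fd)  # simulate the reader dying
        try:
            fifo.write(b'hello')
            fifo.flush()
        except BrokenPipeError:
            failed.append('write/flush')
        try:
            fifo.close()
        except BrokenPipeError:
            failed.append('close')
    return failed


print('buffering=0 :', write_after_reader_gone(0))
print('buffering=-1:', write_after_reader_gone(-1))
```

With buffering=0 the error surfaces in write itself and close has nothing left to flush, so only the first stage is reported. With the default buffering both stages are reported, matching the traceback in the question.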