Pygraphviz crashes after drawing 170 graphs
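I am using pygraphviz to create a large number of graphs for different data configurations. I found that no matter what information goes into the graph, the program crashes after drawing the 170th graph. No error message is generated; the program just stops. Is there something that needs to be reset when drawing this many graphs?
I am running Python 3.7 on a Windows 10 machine, with Pygraphviz 1.5 and graphviz 2.38.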
import os
import pygraphviz

for graph_number in range(200):
    config_graph = pygraphviz.AGraph(strict=False, directed=False, compound=True, ranksep='0.2', nodesep='0.2')
    # Create Directory
    if not os.path.exists('Graph'):
        os.makedirs('Graph')
    # Draw Graph
    print('draw_' + str(graph_number))
    config_graph.layout(prog = 'dot')
    config_graph.draw('Graph/'+str(graph_number)+'.png')
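I tried your code and it generated 200 graphs without any problem (I also tried 2000).
My suggestion is to use these package versions; I installed them in a conda environment with Python 3.7 on macOS: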
graphviz 2.40.1 hefbbd9a_2
pygraphviz 1.3 py37h1de35cc_1
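I was able to consistently reproduce the behavior with:
- Python 3.7.6 (pc064 (64bit), and later also pc032)
- PyGraphviz 1.5 (which I built; available for download at [GitHub]: CristiFati/Prebuilt-Binaries - Various software built on various platforms. (under PyGraphviz, naturally) - might also want to check [SO]: Installing pygraphviz on Windows 10 64-bit, Python 3.6 (@CristiFati's answer))
- Graphviz 2.42.2 ((pc032) same as #2)
I suspected Undefined Behavior (UB) somewhere in the code, even though the behavior was precisely the same on every run:
- OK for 169 graphs
- Crash for 170
I did some debugging (added some print(f) statements in agraph.py and in cgraph.dll (write.c)).
PyGraphviz invokes Graphviz's tools (.exes) for many operations. For that, it uses subprocess.Popen and communicates with the child process via its 3 available streams (stdin, stdout, stderr).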
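To make the three-stream pattern concrete, here is a minimal sketch of a parent process talking to a child over stdin, stdout and stderr pipes. It is not PyGraphviz's actual code: the child is a stand-in Python one-liner (the hypothetical CHILD_CODE), not a Graphviz tool, and run_with_three_pipes is a name made up for this illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for a Graphviz tool: a child Python process that
# echoes its stdin to stdout and writes a short note to stderr.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read()); sys.stderr.write('done')"


def run_with_three_pipes(payload):
    """Send payload to the child's stdin and collect its stdout and stderr."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,   # stream 1: parent -> child
        stdout=subprocess.PIPE,  # stream 2: child -> parent (result)
        stderr=subprocess.PIPE,  # stream 3: child -> parent (diagnostics)
    )
    # communicate() writes the payload, reads both outputs until EOF,
    # waits for the child to exit and closes the parent's pipe ends.
    out, err = proc.communicate(payload)
    return proc.returncode, out, err


returncode, out, err = run_with_three_pipes(b"graph {}")
print(returncode, out, err)
```

Each such call gives the parent three pipe ends, which is where the factor of 3 in the numbers below comes from.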
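From the beginning I noticed that 170 * 3 = 510 (awfully close to 512 (0x200)), but I didn't pay as much attention to it as I should have until later (mainly because the Python process (running the code below) showed no more than ~150 open handles in Task Manager (TM) and Process Explorer (PE)).
However, a bit of Googling revealed:
[SO]: Is there a limit on number of open files in Windows (@stackprogrammer's answer) (starting from here)
[MS.Docs]: _setmaxstdio (which states (emphasis is mine)):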
C run-time I/O now supports up to 8,192 files open simultaneously at the low I/O level. This level includes files opened and accessed using the _open, _read, and _write family of I/O functions. By default, up to 512 files can be open simultaneously at the stream I/O level. This level includes files opened and accessed using the fopen, fgetc, and fputc family of functions. The limit of 512 open files at the stream I/O level can be increased to a maximum of 8,192 by use of the _setmaxstdio function.
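[SO]: Python: Which command increases the number of open files on Windows? (@NorthCat's answer)
Below is your code, which I modified for debugging and for reproducing the error. It needs the PyWin32 package (python -m pip install pywin32); that keeps the code short, although the same thing can be achieved via CTypes.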
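For readers who would rather not install PyWin32, the CTypes route mentioned above might look roughly like the sketch below. It rests on an assumption that was not tested here: that the interpreter and the code opening the streams share the Universal CRT (ucrtbase.dll), so that calling its _setmaxstdio affects the relevant limit. raise_stdio_limit is a hypothetical helper name.

```python
import ctypes
import sys


def raise_stdio_limit(new_max):
    """Try to raise the C runtime stream limit; return the new limit or None."""
    if sys.platform != "win32":
        return None  # _setmaxstdio only exists in the Microsoft C runtime
    try:
        # Assumption: the process uses the Universal CRT (ucrtbase.dll).
        crt = ctypes.CDLL("ucrtbase")
    except OSError:
        return None
    # _setmaxstdio returns the new maximum on success and -1 on failure.
    result = crt._setmaxstdio(new_max)
    return result if result != -1 else None


limit = raise_stdio_limit(513)
print("stream limit:", limit)
```

On non-Windows platforms the helper simply returns None, so the script stays importable everywhere.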
code00.py:
#!/usr/bin/env python

import sys
import os
#import time
import pygraphviz as pgv
import win32file as wfile


def handle_graph(idx, dir_name):
    graph_name = "draw_{0:03d}".format(idx)
    graph_args = {
        "name": graph_name,
        "strict": False,
        "directed": False,
        "compound": True,
        "ranksep": "0.2",
        "nodesep": "0.2",
    }
    graph = pgv.AGraph(**graph_args)
    # Draw Graph
    img_base_name = graph_name + ".png"
    print(" {0:s}".format(img_base_name))
    graph.layout(prog="dot")
    img_full_name = os.path.join(dir_name, img_base_name)
    graph.draw(img_full_name)
    graph.close()  # !!! Has NO (visible) effect, but I think it should be called anyway !!!


def main(*argv):
    print("OLD max open files: {0:d}".format(wfile._getmaxstdio()))
    # 513 is enough for your original code (170 graphs), but you can set it up to 8192
    wfile._setmaxstdio(513)  # !!! COMMENT this line to reproduce the crash !!!
    print("NEW max open files: {0:d}".format(wfile._getmaxstdio()))

    dir_name = "Graph"
    # Create Directory
    if not os.path.isdir(dir_name):
        os.makedirs(dir_name)

    #ts_global_start = time.time()
    start = 0
    count = 169
    #count = 1
    step_sleep = 0.05
    for i in range(start, start + count):
        #ts_local_start = time.time()
        handle_graph(i, dir_name)
        #print(" Time: {0:.3f}".format(time.time() - ts_local_start))
        #time.sleep(step_sleep)
    handle_graph(count, dir_name)
    #print("Global time: {0:.3f}".format(time.time() - ts_global_start - step_sleep * count))


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main(*sys.argv[1:])
    print("\nDone.")
Output:
e:\Work\Dev\Whosebug\q060876623>"e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code00.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32
OLD max open files: 512
NEW max open files: 513
draw_000.png
draw_001.png
draw_002.png
...
draw_167.png
draw_168.png
draw_169.png
Done.
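Conclusions:
- Apparently, some file handles (fds) stay open, although they are not "seen" by TM or PE (probably they live at a lower level). I don't know why this happens (is it an MS UCRT bug?): as far as I know, once a child process ends its streams should be closed, but I don't know how to force that (which would be the proper fix)
- Also, the behavior (a crash) when attempting to write to an fd that isn't open (because it is beyond the limit) seems a bit strange
- As a workaround, the maximum number of open fds can be increased. The following inequality gives an idea of the numbers: 3 * (graph_count + 1) <= max_fds. From there, if you set the limit to 8192 (I didn't test this), you should be able to handle 2729 graphs (assuming that the code opens no additional fds)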
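The inequality above can be turned into a quick calculation. This is only a sketch of the arithmetic: max_graphs is a name made up for this example, and it carries the same assumption as the text, namely that each graph consumes 3 streams and nothing else in the process opens extra fds.

```python
def max_graphs(max_fds, streams_per_graph=3):
    """Largest graph_count satisfying streams_per_graph * (graph_count + 1) <= max_fds."""
    return max_fds // streams_per_graph - 1


for limit in (512, 513, 8192):
    print("max_fds={0:d} -> up to {1:d} graphs".format(limit, max_graphs(limit)))
```

With the default limit of 512 this gives 169 graphs, with 513 it gives 170, and with 8192 it gives 2729, matching the figures quoted in this answer.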
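Side notes:
While investigating, I ran into or noticed several adjacent issues, which I tried to fix:
There is also an open issue for this behavior (probably by the same author): [GitHub]: pygraphviz/pygraphviz - Pygraphviz crashes after drawing 170 graphs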