Does mpi4py use Python pickle?
Going by mpi4py's documentation, I can't tell whether it uses pickle or something more efficient than pickle. The documentation first states:
The pickle (slower, written in pure Python) and cPickle (faster,
written in C) modules provide user-extensible facilities to serialize
generic Python objects using ASCII or binary formats. The marshal
module provides facilities to serialize built-in Python objects using
a binary format specific to Python, but independent of machine
architecture issues.
Based on this, pickle appears to be the slowest method. The documentation then says:
MPI for Python can communicate any built-in or user-defined Python
object taking advantage of the features provided by the pickle
module.
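So does mpi4py use the slowest option, pickle? There is more text, but I don't see a direct answer. Perhaps it isn't used directly in the implementation? Am I completely misunderstanding what it is saying?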
It uses cPickle wherever it can, but falls back on pickle if necessary:
    if PY_MAJOR_VERSION >= 3:
        from pickle import dumps as PyPickle_dumps
        from pickle import loads as PyPickle_loads
        from pickle import DEFAULT_PROTOCOL as PyPickle_PROTOCOL
    else:
        try:
            from cPickle import dumps as PyPickle_dumps
            from cPickle import loads as PyPickle_loads
            from cPickle import HIGHEST_PROTOCOL as PyPickle_PROTOCOL
        except ImportError:
            from pickle import dumps as PyPickle_dumps
            from pickle import loads as PyPickle_loads
            from pickle import HIGHEST_PROTOCOL as PyPickle_PROTOCOL
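The Python 3 branch imports plain pickle because Python 3 has no separate cPickle module: the standard pickle module loads its C accelerator (`_pickle`) automatically when one is available. A minimal sketch for checking which implementation an interpreter is actually using follows; the helper name `pickle_backend` is made up for illustration and is not part of mpi4py.

```python
import pickle


def pickle_backend():
    """Report whether the stdlib pickle module is backed by the C accelerator.

    Hypothetical helper for illustration only; mpi4py does not provide it.
    """
    try:
        import _pickle  # C implementation shipped with CPython 3
    except ImportError:
        return "pure Python"
    # When the accelerator loads, pickle re-exports its Pickler class.
    return "C" if pickle.Pickler is _pickle.Pickler else "pure Python"


print(pickle_backend())
```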
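Since the mpi4py wrappers are based on MPI-2, my guess is that for parallel I/O it only uses pickle to convert the data into a suitable format (on each process) before calling the MPI writing/communication operations behind the scenes. Right after the line you quoted, the documentation says: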
These facilities will be routinely used to build binary representations of
objects to communicate (at sending processes), and restoring them back (at receiving processes).
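The round trip described in that quote can be sketched with the standard library alone. This is only an illustration of the serialize-then-restore step, not mpi4py's actual code path: no MPI call is made, and the names `send_side` and `recv_side` are hypothetical.

```python
import pickle


def send_side(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Build the binary representation that would be handed to MPI as bytes."""
    return pickle.dumps(obj, protocol)


def recv_side(payload):
    """Restore the Python object from the bytes that arrived."""
    return pickle.loads(payload)


message = {"rank": 0, "values": [1.0, 2.5, 3.0], "tag": "demo"}
payload = send_side(message)   # bytes on the sending process
restored = recv_side(payload)  # an equal but distinct object on the receiver
print(type(payload).__name__, restored == message)
```

The receiver ends up with a copy of the object, which is why arbitrary Python objects can be communicated at the cost of the serialization work on both ends.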
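The mpi4py documentation recommends using numpy arrays instead of Python data types wherever possible, for efficiency. If your application is speed-critical, I suggest always using numpy arrays rather than Python objects. The MPI 2.0 standard was introduced mainly to provide efficient parallel I/O. Using MPI (written in C) is very likely faster than cPickle or pickle, especially when writing output from many processes.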
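To illustrate why buffer-like data is cheaper: in mpi4py the lowercase methods such as `comm.send` pickle their argument, while the uppercase methods such as `comm.Send` transmit the memory of a buffer-like object (for example a numpy array) directly. The sketch below does not call MPI at all, and it uses the standard library's `array` module as a stand-in for numpy (an assumption made so that it runs anywhere). It only compares the pickled form of a list with the raw memory of a typed array.

```python
import pickle
from array import array

values = [float(i) for i in range(10_000)]

# Generic object path: the whole list is serialized element by element.
pickled = pickle.dumps(values, pickle.HIGHEST_PROTOCOL)

# Buffer path: a typed array exposes one contiguous block of 8-byte doubles.
typed = array("d", values)
raw = memoryview(typed)

print("pickled bytes:", len(pickled))
print("raw buffer bytes:", raw.nbytes)

# The receiving side can rebuild the array straight from the raw bytes.
restored = array("d")
restored.frombytes(raw.tobytes())
```

The raw buffer needs no per-element encoding or decoding step, which is the saving the recommendation above is pointing at.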