Using multiprocessing.Pool in Python with a function returning a custom object
I am using multiprocessing.Pool to speed up a computation: I call one function many times and then collate the results. Here is a snippet of my code:
import multiprocessing
from functools import partial

def Foo(id: int, constant_arg1: str, constant_arg2: str):
    custom_class_obj = CustomClass(constant_arg1, constant_arg2)
    custom_class_obj.run()  # this changes some attributes of the custom_class_obj
    if (something):
        return None
    else:
        return [custom_class_obj]

def parallel_run(iters: int, a: str, b: str):
    pool = multiprocessing.Pool(processes=k)
    ## create the partial function obj before passing it to pool
    partial_func = partial(Foo, constant_arg1=a, constant_arg2=b)
    ## create the variable id list
    iter_list = list(range(iters))
    all_runs = pool.map(partial_func, iter_list)
    return all_runs
This raises the following error in the multiprocessing module:
multiprocessing.pool.MaybeEncodingError: Error sending result: '[[<CustomClass object at 0x1693c7070>], [<CustomClass object at 0x1693b88e0>], ....]'
Reason: 'TypeError("cannot pickle 'module' object")'
How can I resolve this?
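I was able to reproduce the error message with a minimal example of an unpicklable class. The error says that your class instance can't be pickled because it contains a reference to a module, and modules are not picklable. Results from Pool workers are pickled before being sent back to the main process, so every object Foo returns must be picklable. You need to comb through CustomClass to make sure instances don't hold things like open file handles or module references. If you do need those things, you should use __getstate__ and __setstate__ to customize the pickle and unpickle process.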
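One way to comb through an instance is to try pickling each attribute on its own and see which ones fail. The sketch below is only an illustration: the helper find_unpicklable_attributes and the Example class are hypothetical names standing in for your CustomClass, and it assumes the instance stores its attributes in a regular __dict__ (no __slots__).

```python
import pickle


def find_unpicklable_attributes(obj):
    """Return {attribute name: exception} for instance attributes that fail to pickle."""
    problems = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle can raise several exception types
            problems[name] = exc
    return problems


class Example:
    """Hypothetical stand-in for CustomClass."""

    def __init__(self):
        import os
        self.value = 42
        self.module = os  # a module reference, which pickle rejects


bad = find_unpicklable_attributes(Example())
for name, exc in bad.items():
    print(f"{name}: {exc!r}")
```

This only checks top-level attributes; an unpicklable object nested inside a container is reported under the name of the container that holds it.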
A distilled example of your error:
from multiprocessing import Pool
from functools import partial

class klass:
    def __init__(self, a):
        self.value = a
        import os
        self.module = os  # this fails: can't pickle a module and send it back to main process

def foo(a, b, c):
    return klass(a + b + c)

if __name__ == "__main__":
    with Pool() as p:
        a = 1
        b = 2
        bar = partial(foo, a, b)
        res = p.map(bar, range(10))
    print([r.value for r in res])
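As a rough sketch of the __getstate__ / __setstate__ route applied to the distilled example: the class below replaces the module with its name when pickling and re-imports it when unpickling. Storing the name and calling importlib.import_module are my own assumptions about what a reasonable substitute looks like, not something your CustomClass necessarily needs. A plain pickle round trip stands in here for the step Pool performs when it sends a result back.

```python
import importlib
import pickle


class klass:
    def __init__(self, a):
        import os
        self.value = a
        self.module = os  # not picklable on its own

    def __getstate__(self):
        # Copy the instance dict and swap the module for its importable name.
        state = self.__dict__.copy()
        state["module"] = self.module.__name__
        return state

    def __setstate__(self, state):
        # Re-import the module by name on the receiving side.
        state = dict(state)
        state["module"] = importlib.import_module(state["module"])
        self.__dict__.update(state)


# Pool pickles each result to return it; a round trip simulates that step.
original = klass(3)
restored = pickle.loads(pickle.dumps(original))
print(restored.value, restored.module.__name__)
```

With these two methods in place, the p.map call from the distilled example should be able to return klass instances. If the instance doesn't really need to hold the module, simply removing the attribute and importing the module at the top of the file is the simpler fix.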