Unexpected behavior using multiprocessing.Pool inside a for loop
Here is my code:
    import multiprocessing as mp
    import numpy as np

    def foo(p):
        global i
        return p*i

    global lower, upper
    lower = 1
    upper = 4

    for i in range(lower, upper):
        if __name__ == '__main__':
            dataset = np.linspace(1, 100, 100)
            agents = mp.cpu_count() - 1
            chunksize = 5
            pool = mp.Pool(processes=agents)
            result = pool.map(foo, dataset, chunksize)
            print result
            print i
            pool.close()
            pool.join()
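The console prints the array [3, 6, 9,...,300] three times, with the integers 1, 2, 3 between the printouts. So I am iterating correctly between lower and upper (exclusive), but I expected it to print the array [1, 2, 3,...,100] first, then [2, 4, 6,...,200], and finally [3, 6, 9,...,300]. I don't understand why it passes only the final value of i to foo and then maps three times.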
When you run a new process, this is what it sees:
    import multiprocessing as mp
    import numpy as np

    def foo(p):
        global i
        return p*i

    global lower, upper
    lower = 1
    upper = 4

    for i in range(lower, upper):
        if __name__ == '__main__':
            # This part is not run, as
            # in a different process,
            # __name__ is set to '__mp_main__'
            pass

    # i is now `upper - 1`, call `foo(p)` with the provided `p`
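After executing this, the new process is told to run foo (it has to run the whole script again to find out what foo is, simply because of how pickling works: a function is pickled by name, not by value). So, after that run, i will be upper - 1, and foo will always return p * 3. This re-import happens when workers are started with the spawn method, which is the default on Windows.
You want to make i a parameter of foo, or use some multiprocessing-specific memory-sharing object, as described here.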
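The re-import described above can be imitated without starting any processes. The sketch below is only a simulation: it runs a trimmed copy of the script with exec under the name '__mp_main__', which is how a spawned worker sees it. The SCRIPT string and the helper simulate_child_import are hypothetical names introduced for illustration, not part of multiprocessing.

```python
# A trimmed copy of the question's script, held as text so it can be
# executed the way a spawned worker would import it.
SCRIPT = '''
def foo(p):
    return p * i

lower = 1
upper = 4

for i in range(lower, upper):
    if __name__ == '__main__':
        pass  # the pool work lives here and only runs in the parent
'''


def simulate_child_import(source):
    """Execute the script with __name__ set as it is in a spawned worker."""
    namespace = {"__name__": "__mp_main__"}
    exec(source, namespace)
    return namespace


child = simulate_child_import(SCRIPT)
print(child["i"])        # the loop has run to completion in the "child"
print(child["foo"](10))  # foo multiplies by that final value of i
```

Because the loop body is skipped but the loop itself still runs, the simulated worker ends up with the last value of i no matter which iteration the parent was in when it submitted the work.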
Making i local and using functools.partial will probably solve your problem:
    import multiprocessing as mp
    import numpy as np
    import functools

    def foo(p, i):
        return p*i

    global lower, upper
    lower = 1
    upper = 4

    for i in range(lower, upper):
        if __name__ == '__main__':
            dataset = np.linspace(1, 100, 100)
            agents = mp.cpu_count() - 1
            chunksize = 5
            pool = mp.Pool(processes=agents)
            # Bind i by keyword so that pool.map supplies p, not i
            foo2 = functools.partial(foo, i=i)
            result = pool.map(foo2, dataset, chunksize)
            print(result)
            print(i)
            pool.close()
            pool.join()
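As a possible variation on the same idea, the __main__ guard can sit outside the loop so that a single pool is created once and reused for every value of i. This is a minimal sketch rather than the answerer's code: the run helper and the results dictionary are names made up for illustration, a plain list stands in for the NumPy array to keep the example standard-library only, and max(1, ...) is an added safeguard for single-core machines.

```python
import functools
import multiprocessing as mp


def foo(p, i):
    # i arrives as an explicit argument, so no worker depends on a global.
    return p * i


def run(lower, upper, dataset, agents, chunksize=5):
    """Map foo over dataset once per i, reusing a single pool (hypothetical helper)."""
    results = {}
    with mp.Pool(processes=agents) as pool:
        for i in range(lower, upper):
            scaled = functools.partial(foo, i=i)  # bind i by keyword
            results[i] = pool.map(scaled, dataset, chunksize)
    return results


if __name__ == '__main__':
    dataset = list(range(1, 101))        # stand-in for np.linspace(1, 100, 100)
    agents = max(1, mp.cpu_count() - 1)  # never ask for zero workers
    for i, result in run(1, 4, dataset, agents).items():
        print(i, result[:3], '...', result[-1])
```

With the guard at the top level, a re-imported worker defines foo and run and then stops, so nothing in the module depends on the loop variable.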