在 MPI 中应用 Reduce 和 Broadcast
Applying Reduce and Broadcast in MPI
我正在设计一个 Python 程序来计算 n 的某个固定值在 k < n < k^2 之间的素数个数。我的原始实现通过按处理器数量拆分计算来计算素数的数量。
我的意图是应用 reduce
和 broadcast
命令来并行化所有处理器之间的工作。我是 MPI 的新手,不知道该怎么做,所以我在代码中注释掉了这两行。我很好奇是否应该遵循特定的方向来完成 MPI 中的 reduce
和 broadcast
命令。我的原码是
import numpy as np
import platform
import sys
from mpi4py import MPI
comm = MPI.COMM_WORLD
id = comm.Get_rank( )
p = comm.Get_size( )
p=1
# Find the primes between 2 and k. Initialize k.
k=10
# Define a list S_k of the primes between 2 and k
primes=[]
# Define a list to store numbers that aren't prime between 2 and k.
not_prime = []
# Define a list S_k2 of the primes between k and k**2
primes2=[]
# Count the number of primes from 2 to k
for i in range(2, k+1):
if i not in not_prime:
primes.append(i)
for j in range(i*i, k+1, i):
not_prime.append(j)
# Find the number of primes between k and k**2
b=(k**2-k)/p
for n in range(int(k+(p-1)*b),int(k+(p)*b)):
counter = 0
for i in range(len(primes)):
if (n % primes[i]) == 0:
break
else:
counter = counter + 1
if (counter==len(S_k)):
primes2.append(n)
# I'm not sure what to use as parameters for comm.Reduce and comm.bcast
# comm.reduce = (primes2, op = MPI.SUM, root = 0 )
# comm.bcast |= (primes2, op = MPI.SUM, root = 0 )
print ("Number of processors: ",p)
print (primes2)
print((int(k+(p-1)*b),int(k+(p)*b)))
针对 p=1 进行测试时,代码 returns
Number of processors: 1
[11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
(10, 100)
针对 p=2 进行测试时,代码 returns
Number of processors: 2
[59, 61, 67, 71, 73, 79, 83, 89, 97]
(55, 100)
针对 p=3 进行测试时,代码 returns
Number of processors: 3
[71, 73, 79, 83, 89, 97]
(70, 100)
并且随着 p 的增加,primes2
中素数的数量减少。
理想情况下,我正在研究如何并行化代码以便针对 p=3 进行测试 returns
Number of processors: 3
Processor 1 computed [11, 13, 17, 19, 23, 29, 31, 37]
Processor 2 computed [41, 43, 47, 53, 59, 61, 67]
Processor 3 computed [71, 73, 79, 83, 89, 97]
我认为这可以通过为 MPI 应用 reduce 和 broadcast 命令来完成。我不确定我应该如何调整
# comm.reduce = (primes2, op = MPI.SUM, root = 0 )
# comm.bcast |= (primes2, op = MPI.SUM, root = 0 )
以便各个处理器计算素数的子集。
#mpiexec -n 3 python n_loop.py
import numpy as np
import platform
import sys
from mpi4py import MPI
comm = MPI.COMM_WORLD
id = comm.Get_rank( )
p = comm.Get_size( )
# Find the primes between 2 and k. Initialize k.
k=20
# Define a list S_k of the primes between 2 and k
S_k=[]
# Define a list to store numbers that aren't prime between 2 and k.
not_prime = []
# Define a list S_k2 of the primes between k and k**2
S_k2=[]
# Count the number of primes from 2 to k
for i in range(2, k+1):
if i not in not_prime:
S_k.append(i)
for j in range(i*i, k+1, i):
not_prime.append(j)
# Find the number of primes between k and k**2 by
# pararllelizing the n-loop.
b=(k**2-k)/p
for n in range(int(k+(id)*b),int(k+(id+1)*b)):
counter = 0
for i in range(len(S_k)):
if (n % S_k[i]) == 0:
break
else:
counter = counter + 1
if (counter==len(S_k)):
S_k2.append(n)
# Compute the amount of primes in the two lists.
processor_num_primes = len(S_k2)
original_num_primes = len(S_k)
# Broadcast the amount of primes and calculate the unions
# of sets of integers.
countb = 0
totalb = 0
for i in range(0,p):
countb = comm.bcast(S_k2,i)
totalb = totalb + len(countb)
total = comm.reduce(totalb,MPI.BOR,0)
if (id == 0):
print ("Total number of processors: ",p)
print ("Processor ",id, "is calculating:", S_k2)
print ("Processor ",id, "is calculating this many primes :",processor_num_primes)
print("Processor ",id,"has the following range:",(int(k+(id)*b),int(k+(id+1)*b)))
if (id == 0):
print("The total number of primes found between",2,"and",k*k,"by the n-"
"loop is:",total+original_num_primes)
print('')
我正在设计一个 Python 程序来计算 n 的某个固定值在 k < n < k^2 之间的素数个数。我的原始实现通过按处理器数量拆分计算来计算素数的数量。
我的意图是应用 reduce
和 broadcast
命令来并行化所有处理器之间的工作。我是 MPI 的新手,不知道该怎么做,所以我在代码中注释掉了这两行。我很好奇是否应该遵循特定的方向来完成 MPI 中的 reduce
和 broadcast
命令。我的原码是
import numpy as np
import platform
import sys
from mpi4py import MPI
comm = MPI.COMM_WORLD
id = comm.Get_rank( )
p = comm.Get_size( )
p=1
# Find the primes between 2 and k. Initialize k.
k=10
# Define a list S_k of the primes between 2 and k
primes=[]
# Define a list to store numbers that aren't prime between 2 and k.
not_prime = []
# Define a list S_k2 of the primes between k and k**2
primes2=[]
# Count the number of primes from 2 to k
for i in range(2, k+1):
if i not in not_prime:
primes.append(i)
for j in range(i*i, k+1, i):
not_prime.append(j)
# Find the number of primes between k and k**2
b=(k**2-k)/p
for n in range(int(k+(p-1)*b),int(k+(p)*b)):
counter = 0
for i in range(len(primes)):
if (n % primes[i]) == 0:
break
else:
counter = counter + 1
if (counter==len(S_k)):
primes2.append(n)
# I'm not sure what to use as parameters for comm.Reduce and comm.bcast
# comm.reduce = (primes2, op = MPI.SUM, root = 0 )
# comm.bcast |= (primes2, op = MPI.SUM, root = 0 )
print ("Number of processors: ",p)
print (primes2)
print((int(k+(p-1)*b),int(k+(p)*b)))
针对 p=1 进行测试时,代码 returns
Number of processors: 1
[11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
(10, 100)
针对 p=2 进行测试时,代码 returns
Number of processors: 2
[59, 61, 67, 71, 73, 79, 83, 89, 97]
(55, 100)
针对 p=3 进行测试时,代码 returns
Number of processors: 3
[71, 73, 79, 83, 89, 97]
(70, 100)
并且随着 p 的增加,primes2
中素数的数量减少。
理想情况下,我正在研究如何并行化代码以便针对 p=3 进行测试 returns
Number of processors: 3
Processor 1 computed [11, 13, 17, 19, 23, 29, 31, 37]
Processor 2 computed [41, 43, 47, 53, 59, 61, 67]
Processor 3 computed [71, 73, 79, 83, 89, 97]
我认为这可以通过为 MPI 应用 reduce 和 broadcast 命令来完成。我不确定我应该如何调整
# comm.reduce = (primes2, op = MPI.SUM, root = 0 )
# comm.bcast |= (primes2, op = MPI.SUM, root = 0 )
以便各个处理器计算素数的子集。
#mpiexec -n 3 python n_loop.py
import numpy as np
import platform
import sys
from mpi4py import MPI
comm = MPI.COMM_WORLD
id = comm.Get_rank( )
p = comm.Get_size( )
# Find the primes between 2 and k. Initialize k.
k=20
# Define a list S_k of the primes between 2 and k
S_k=[]
# Define a list to store numbers that aren't prime between 2 and k.
not_prime = []
# Define a list S_k2 of the primes between k and k**2
S_k2=[]
# Count the number of primes from 2 to k
for i in range(2, k+1):
if i not in not_prime:
S_k.append(i)
for j in range(i*i, k+1, i):
not_prime.append(j)
# Find the number of primes between k and k**2 by
# pararllelizing the n-loop.
b=(k**2-k)/p
for n in range(int(k+(id)*b),int(k+(id+1)*b)):
counter = 0
for i in range(len(S_k)):
if (n % S_k[i]) == 0:
break
else:
counter = counter + 1
if (counter==len(S_k)):
S_k2.append(n)
# Compute the amount of primes in the two lists.
processor_num_primes = len(S_k2)
original_num_primes = len(S_k)
# Broadcast the amount of primes and calculate the unions
# of sets of integers.
countb = 0
totalb = 0
for i in range(0,p):
countb = comm.bcast(S_k2,i)
totalb = totalb + len(countb)
total = comm.reduce(totalb,MPI.BOR,0)
if (id == 0):
print ("Total number of processors: ",p)
print ("Processor ",id, "is calculating:", S_k2)
print ("Processor ",id, "is calculating this many primes :",processor_num_primes)
print("Processor ",id,"has the following range:",(int(k+(id)*b),int(k+(id+1)*b)))
if (id == 0):
print("The total number of primes found between",2,"and",k*k,"by the n-"
"loop is:",total+original_num_primes)
print('')