Is it possible that some processes in this program finish sooner than others?
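I have a program that is designed to be highly parallelizable. I suspect that some processors finish this Python script sooner than others, which would explain behavior I observe upstream of this code. Is it possible that this code allows some MPI processes to finish sooner than others?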
dacout = 'output_file.out'
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nam = 'lcoe.coe'
csize = 10000
with open(dacout) as f:
    for i, l in enumerate(f):
        pass
numlines = i
dakchunks = pd.read_csv(dacout, skiprows=0, chunksize=csize, sep='there_are_no_seperators')
linespassed = 0
vals = {}
for dchunk in dakchunks:
    for line in dchunk.values:
        linespassed += 1
        if linespassed < 49 or linespassed > numlines - 50: continue
        else:
            split_line = ''.join(str(s) for s in line).split()
        if len(split_line) == 2:
            if split_line[0] == 'nan' or split_line[0] == '-nan': continue
            if split_line[1] != nam: continue
            if split_line[1] not in vals:
                try: vals[split_line[1]] = [float(split_line[0])]
                except NameError: continue
            else: vals[split_line[1]].append(float(split_line[0]))
# Calculate mean and x s.t. Percentile_x(coe_dat)<threshold_coe
self.coe_vals = sorted(vals[nam])
self.mean_coe = np.mean(self.coe_vals)
self.p90 = np.percentile(self.coe_vals, 90)
self.p95 = np.percentile(self.coe_vals, 95)
count_vals = 0.00
for i in self.coe_vals:
    count_vals += 1
    if i > coe_threshold: break
self.perc = 100 * (count_vals/len(self.coe_vals))
if rank == 0:
    print>>logf, self.rp, self.rd, self.hh, self.mean_coe
    print self.rp, self.rd, self.hh, self.mean_coe, self.p90, self.perc
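In the code you posted, every process reads the same file and computes the same thing, yet process 0 is the only one that prints the result. That is not parallel computing; it is doing the same thing several times!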
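One way to actually divide the work, as a minimal sketch rather than a drop-in fix: each rank handles only every `size`-th value, and the partial results are combined with a reduction. The helper names `values_for_rank` and `parallel_mean` are hypothetical, the sketch assumes the values have already been parsed into a list on every rank, and it is written for Python 3 rather than the Python 2 of the question.

```python
def values_for_rank(values, rank, size):
    """Return the round-robin share of `values` owned by `rank` out of `size` ranks."""
    return values[rank::size]


def parallel_mean(comm, values):
    """Mean of `values`, with each rank summing only its own share.

    `comm` is assumed to be an mpi4py communicator such as MPI.COMM_WORLD.
    """
    rank = comm.Get_rank()
    size = comm.Get_size()
    mine = values_for_rank(values, rank, size)
    # allreduce combines the per-rank partial results and hands the total
    # back to every rank (the default reduction operation is a sum).
    total = comm.allreduce(sum(mine))
    count = comm.allreduce(len(mine))
    return total / count
```

Launched with something like `mpiexec -n 4 python script.py`, a call to `parallel_mean(MPI.COMM_WORLD, values)` after `from mpi4py import MPI` would have each of the four ranks sum roughly a quarter of the list. Percentiles do not reduce this simply, so they would still need the gathered values on one rank.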
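Some processes can finish this script before others because the script does not end with a barrier. Use comm.barrier() to synchronize all processes of the communicator comm. Do this only when it is necessary: barriers hurt performance...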
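To see how much a barrier costs, a small sketch along these lines may help. The function name `run_then_sync` and the `work` callable are hypothetical, and `comm` is assumed to be an mpi4py communicator such as `MPI.COMM_WORLD`.

```python
import time


def run_then_sync(comm, work):
    """Run `work()` on this rank, then wait until every rank has finished.

    Returns (seconds spent working, seconds spent waiting at the barrier).
    """
    start = time.perf_counter()
    work()
    worked = time.perf_counter() - start

    start = time.perf_counter()
    comm.barrier()  # returns only after every rank in comm has called it
    waited = time.perf_counter() - start
    return worked, waited
```

A rank that reports a large waiting time finished early and then sat idle until the slowest rank arrived, which is the performance cost mentioned above.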