MPI: parallelize a buffer over and over

Question

Suppose I have a large input file.

Suppose this file contains items that I want to process in parallel.

std::vector<std::string> items(100000, "");
for (int i = 0; i < 100000; i++)  // bound must match the vector size
    items[i] = pop_item(file);

Next, I want to speed things up by processing these items in parallel with MPI:

std::vector<MyObj> processed_items(100000); // pseudo-code, I handle the memory allocation
int size, rank;
MPI_Init(NULL, NULL);

MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);

// each rank takes every size-th item, starting at its own rank
for (int i = rank; i < 100000; i += size)
    processed_items[i] = process_item(items[i]);

MPI_Finalize();

OK, great, it works.
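
For reference, the round-robin split used by that loop can be checked without MPI. This is a minimal sketch; `indices_for_rank` is a hypothetical helper name that does not appear in the original code, and it only mirrors the loop header `for (i = rank; i < n; i += size)`.

```cpp
#include <vector>

// Hypothetical helper: the indices one rank would handle when n items are
// dealt out round-robin across `size` ranks, i.e. the same sequence as
// for (i = rank; i < n; i += size).
std::vector<int> indices_for_rank(int rank, int size, int n) {
    std::vector<int> out;
    for (int i = rank; i < n; i += size)
        out.push_back(i);
    return out;
}
```

With 4 ranks and 10 items, rank 1 would handle items 1, 5 and 9. Note that each rank only fills its own slots of `processed_items`, so the results presumably still have to be combined on one rank before they can all be printed from there.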

Now I want to do this over and over inside a while loop:

while (!done) {
    done = fill_items(&items, file);

    MPI_Init(NULL, NULL);

    ...;

    MPI_Finalize();

    print_items(&processed_items);
}

However, this fails with "error: mpi_init called after mpi finalize invoked."

What is the intended way to handle this in MPI?

Answer 1

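MPI_Init() and MPI_Finalize() can each be called only once per program, as your error message indicates. This old answer from half a year ago outlines how to use MPI to run only certain parts of a program in parallel: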

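#include <mpi.h>
#include <stdio.h>

// Placeholders used by the answer; their definitions are not shown.
void PreParallelWork(void);
void ParallelWork(void);
void PostParallelWork(void);

int main(int argc, char *argv[]) {
    int numprocs, myid;

    MPI_Init(&argc, &argv);  // call once, before any other MPI call
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    if (myid == 0) { // Do the serial part on a single MPI process
        printf("Performing serial computation on cpu %d\n", myid);
        PreParallelWork();
    }

    ParallelWork();  // Every MPI process will run the parallel work

    if (myid == 0) { // Do the final serial part on a single MPI process
        printf("Performing the final serial computation on cpu %d\n", myid);
        PostParallelWork();
    }

    MPI_Finalize();  // call once, after all other MPI calls
    return 0;
}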

c++

mpi