MPI_Gather with a 2D dynamic array in C exits on signal 6 (aborted)

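After a lot of searching, I finally found a function that allocates memory for an n-D array contiguously (laid out linearly, like a vector).
The function is: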

int malloc2dint(int ***array, int n, int m) 
{
    /* allocate the n*m contiguous items */
    int *p = (int *)malloc(n*m*sizeof(int));
    if (!p) return -1;

    /* allocate the row pointers into the memory */
    (*array) = (int **)malloc(n*sizeof(int*));
    if (!(*array)) 
    {
        free(p);
        return -1;
    }

    /* set up the pointers into the contiguous memory */
    int i;
    for (i=0; i<n; i++) 
        (*array)[i] = &(p[i*m]);

    return 0;
}  
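The post does not show how such an array is released. A minimal sketch follows, assuming the array came from malloc2dint as defined above. The names free2dint and rows_are_contiguous are hypothetical helpers introduced here for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>

/* Same layout as malloc2dint above: one contiguous block of n*m ints
   plus a table of n row pointers into that block. */
int malloc2dint(int ***array, int n, int m)
{
    int *p = (int *)malloc((size_t)n * m * sizeof(int));
    if (!p) return -1;

    *array = (int **)malloc((size_t)n * sizeof(int *));
    if (!*array) {
        free(p);
        return -1;
    }

    for (int i = 0; i < n; i++)
        (*array)[i] = &p[i * m];
    return 0;
}

/* Hypothetical counterpart (an assumption, not from the post):
   release both allocations. (*array)[0] is the start of the data block. */
void free2dint(int ***array)
{
    assert(array != NULL && *array != NULL);
    free((*array)[0]);  /* the n*m contiguous ints */
    free(*array);       /* the row-pointer table */
    *array = NULL;
}

/* Hypothetical check: returns 1 if each row starts exactly m ints
   after the previous one, i.e. the data really is one linear block. */
int rows_are_contiguous(int **a, int n, int m)
{
    for (int i = 0; i + 1 < n; i++)
        if (a[i + 1] != a[i] + m) return 0;
    return 1;
}
```

Because the rows are contiguous, &array[0][0] can be handed to MPI as a single buffer of n*m ints, which is what the broadcast and scatter calls below rely on.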

Using this method, I can correctly broadcast and scatter a dynamically allocated 2D array, but MPI_Gather still fails.
The main function is:

int length = atoi(argv[1]);
int rank, size, from, to, i, j, k, **first_array, **second_array, **result_array;

MPI_Init (&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);

//2D dynamic memory allocation
malloc2dint(&first_array, length, length);
malloc2dint(&second_array, length, length);
malloc2dint(&result_array, length, length);

//Related boundary to each task
from = rank * length/size;
to = (rank+1) * length/size;

//Initializing first and second array
if (rank==0) 
{
    for(i=0; i<length; i++)
        for(j=0; j<length; j++)
        {
            first_array[i][j] = 1;
            second_array[i][j] = 1;
        }
}

//Broadcast second array so all tasks will have it
MPI_Bcast (&(second_array[0][0]), length*length, MPI_INT, 0, MPI_COMM_WORLD);

//Scatter first array so each task has matrix values between its boundary
MPI_Scatter (&(first_array[0][0]), length*(length/size), MPI_INT, first_array[from], length*(length/size), MPI_INT, 0, MPI_COMM_WORLD);


//Now each task will calculate matrix multiplication for its part
for (i=from; i<to; i++) 
    for (j=0; j<length; j++) 
    {
        result_array[i][j]=0;
        for (k=0; k<length; k++)
            result_array[i][j] += first_array[i][k]*second_array[k][j];

        //printf("\nrank(%d)->result_array[%d][%d] = %d\n", rank, i, j, result_array[i][j]);
        //this line print the correct value
    }

//Gathering info from all tasks and putting each partition into result_array
MPI_Gather (&(result_array[from]), length*(length/size), MPI_INT, result_array, length*(length/size), MPI_INT, 0, MPI_COMM_WORLD);

if (rank==0) 
{
    for (i=0; i<length; i++) 
    {
        printf("\n\t| ");
        for (j=0; j<length; j++)
            printf("%2d ", result_array[i][j]);
        printf("|\n");
    }
}

MPI_Finalize();
return 0;  

Now when I run mpirun -np 2 xxx.out 4 the output is:

|  4  4  4  4 | ---> Good Job!

|  4  4  4  4 | ---> Good Job!

| 1919252078 1852795251 1868524912 778400882 | ---> Where are you baby?!!!

| 540700531 1701080693 1701734758 2037588068 | ---> Where are you baby?!!!

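Finally, mpirun reports that the process with rank 0 exited on signal 6 (aborted).
What is strange to me is that MPI_Bcast and MPI_Scatter work fine but MPI_Gather does not.
Any help would be greatly appreciated.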

The problem is in how you pass the buffers. You do it correctly in MPI_Scatter, but incorrectly in MPI_Gather.

Passing result_array as &result_array[from] makes MPI read the memory that holds the list of row pointers, not the actual matrix data. Use &result_array[from][0] instead.

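The same goes for the receive buffer. Pass &result_array[0][0] rather than result_array, so that MPI gets a pointer to where the data actually lives in memory. With result_array as the receive buffer, the gathered ints are written over the row-pointer table, which is smaller than the matrix; that overrun is the likely cause of the abort.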
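The distinction between the two addresses can be checked without MPI. The sketch below assumes the same layout as malloc2dint (one data block plus a row-pointer table). make_matrix, points_into_data and copy_ints are hypothetical helpers written for this illustration; copy_ints only stands in for what a collective does when it reads count ints from a send buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build an n x m matrix with the malloc2dint layout,
   every element set to value. Returns NULL on allocation failure. */
int **make_matrix(int n, int m, int value)
{
    assert(n > 0 && m > 0);
    int *data = malloc((size_t)n * m * sizeof *data);
    int **rows = malloc((size_t)n * sizeof *rows);
    if (!data || !rows) {
        free(data);
        free(rows);
        return NULL;
    }
    for (int i = 0; i < n * m; i++) data[i] = value;
    for (int i = 0; i < n; i++) rows[i] = &data[i * m];
    return rows;
}

/* Returns 1 if buf is the address of some element of the data block. */
int points_into_data(int **a, int n, int m, const void *buf)
{
    const int *data = &a[0][0];
    for (int i = 0; i < n * m; i++)
        if (buf == (const void *)(data + i)) return 1;
    return 0;
}

/* Stand-in for a collective reading a send buffer:
   copies count ints starting at buf. */
void copy_ints(int *dst, const void *buf, int count)
{
    memcpy(dst, buf, (size_t)count * sizeof(int));
}
```

For a 4 x 4 matrix a, points_into_data accepts &a[2][0] and rejects both &a[2] and a, because the latter two are addresses inside the row-pointer table. MPI takes its buffers as void *, so the compiler cannot flag the wrong one.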

So, instead of:

//Gathering info from all tasks and putting each partition into result_array
MPI_Gather (&(result_array[from]), length*(length/size), MPI_INT, result_array, length*(length/size), MPI_INT, 0, MPI_COMM_WORLD);

do:

//Gathering info from all tasks and putting each partition into result_array
MPI_Gather (&(result_array[from][0]), length*(length/size), MPI_INT, &(result_array[0][0]), length*(length/size), MPI_INT, 0, MPI_COMM_WORLD);