Collect all MPI-computed matrix rows into the root matrix

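I have spent the whole day trying to implement matrix multiplication with MPI, and none of the examples I found online work for me (I don't know why: the code compiles and runs, but nothing gets computed). Here is what I am doing: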

From bash:

mpirun -n 2 out/lb8

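It reads a 2x4 matrix (1 row per process) and starts computing. The problem is in the MPI_Sendrecv block (or, more generally, in how the data is gathered):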

void Matrix_MPY(double **matrix_a, double **matrix_b, double ***matrix_c, int a_rows, int a_cols) {
    int i, j;
    int process_rank, process_count;
    MPI_Comm_rank(MPI_COMM_WORLD, &process_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &process_count);

    if (a_rows % process_count != 0) {
        error_code = NOT_DEVIDED_BY_RANK_EXCEPTION;
        return;
    }

    int rows_per_process = a_rows / process_count;
    int current_row = rows_per_process * process_rank;

    double **temp;
    temp = (double **) malloc(sizeof(double *) * a_rows);
    for (i = 0; i < a_rows; ++i){
        /* calloc zeroes the row, since the loop below accumulates with += */
        temp[i] = (double *) calloc(a_rows, sizeof(double));
    }

    for (i = current_row; i < current_row + rows_per_process; ++i) {
        for (j = 0; j < a_rows; ++j)
        {
            int k;
            for(k = 0; k < a_cols; ++k){
                temp[i][j] += matrix_a[i][k] * matrix_b[k][j];
            }
        }
        MPI_Sendrecv(temp[i], a_rows, MPI_DOUBLE, ROOT, TAG, temp[i], a_rows, MPI_DOUBLE, process_rank, TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    *matrix_c = temp;
}

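This solution worked for me: every non-root rank sends each row it computed to ROOT, and ROOT then receives those rows, using the row index as the message tag: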

....
        if (process_rank != ROOT) {
            MPI_Send(temp[i], a_rows, MPI_DOUBLE, ROOT, i, MPI_COMM_WORLD);
        }
}

if (process_rank == ROOT) {
    for (i = 1; i < process_count; ++i)
    {
        for (j = i * rows_per_process; j < i * rows_per_process + rows_per_process; ++j)
        {
            /* MPI_Recv takes a single status, so MPI_STATUS_IGNORE is the matching constant */
            MPI_Recv(temp[j], a_rows, MPI_DOUBLE, i, j, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    }
}

*matrix_c = temp;
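
A possible variation on the above, as a rough sketch rather than what I actually ran: if each process keeps its rows in one flat, contiguous row-major buffer instead of a double ** with one malloc per row, its whole block could be handed over in a single collective call instead of one send/receive pair per row. The helper below only shows the local computation on such flat buffers. The names multiply_block, a_block, c_block, block_rows, inner and b_cols are hypothetical and do not come from the code above.

```c
/* Multiply a block of rows of A by B, all stored as flat row-major arrays.
 *   a_block: block_rows x inner   (the rows of A owned by this process)
 *   b:       inner x b_cols       (the full matrix B)
 *   c_block: block_rows x b_cols  (the rows of C produced by this process)
 * Hypothetical helper for illustration; not part of the original code. */
void multiply_block(const double *a_block, const double *b, double *c_block,
                    int block_rows, int inner, int b_cols) {
    for (int i = 0; i < block_rows; ++i) {
        for (int j = 0; j < b_cols; ++j) {
            /* Start every sum from zero explicitly: memory returned by
             * malloc is not zeroed, so accumulating into it directly
             * would give undefined results. */
            double sum = 0.0;
            for (int k = 0; k < inner; ++k) {
                sum += a_block[i * inner + k] * b[k * b_cols + j];
            }
            c_block[i * b_cols + j] = sum;
        }
    }
}
```

With this layout, each process (ROOT included) could call MPI_Gather with c_block as the send buffer, rows_per_process * b_cols as both the send count and the receive count, MPI_DOUBLE as the type, and ROOT as the root. The blocks would then arrive in ROOT's receive buffer in rank order. This only works when the rows are contiguous in memory, which the per-row malloc in the question does not guarantee.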