MPI 和变量覆盖?

MPI and Variable overwrite?

我尝试 运行 一个用于计算 burmester desmedt 密钥协议的分布式应用程序。

执行此操作的代码如下:

#include "mpi.h"
#include <stdio.h>

// Openssl
#include <openssl/conf.h>
#include <openssl/evp.h>
#include <openssl/err.h>

// Custom
#include "message.h"
#include "dh.h"

void cleanup(DH *secret);

void cleanup(DH *secret) {
  EVP_cleanup();
  CRYPTO_cleanup_all_ex_data();
  OPENSSL_free(secret);
  ERR_free_strings();
}

int main(int argc, char *argv[]) {
  int rank, size;


  MPI_Init( &argc, &argv );
  MPI_Comm_rank( MPI_COMM_WORLD, &rank );
  MPI_Comm_size( MPI_COMM_WORLD, &size );

  /* Load the human readable error strings for libcrypto */
  ERR_load_crypto_strings();
  /* Load all digest and cipher algorithms */
  OpenSSL_add_all_algorithms();
  /* Load config file, and other important initialisation */
  OPENSSL_config(NULL);

  DH *secret;

  printf("RANK %d, Generating Diffie Hellman Keys\n", rank);
  fflush(stdout);
  if(-1 == generateKeys(secret)) {
    cleanup(secret);
    fprintf(stderr, "RANK %d, Failed to intialize key\n",rank);
    fflush(stderr);
    return -1;
  }

  printf("RANK %d, Keys generated\n",rank);
  fflush(stdout);

  // Set up Barrier for cpommunications
  MPI_Barrier(MPI_COMM_WORLD);

  printf("RANK %d, Publishing Keys\n", rank);
  fflush(stdout);

  if(secret == NULL) {
    cleanup(secret);
    fprintf(stderr, "RANK %d, Error on Generating the Diffie Hellman\n",rank);
    fflush(stderr);
    return -1;
  }

  if(MPIbcastBigNum(secret->pub_key, rank, "Publishing Public Key") == -1){
   cleanup(secret);
   return -1;
  }

  /*Cleanup */
  cleanup(secret);
  MPI_Finalize();
  return 0;
}

还有用于初始化 DH 和消息传递的辅助方法(为了保存 space 我 post 通过链接的代码):

我也是通过这个编译的Makefile。所以我想知道为什么我下面的执行会变成段错误?

mpirun -np 3 ./builds/main
RANK 1, Generating Diffie Hellman Keys
RANK 2, Generating Diffie Hellman Keys
RANK 0, Generating Diffie Hellman Keys
RANK 1, Keys generated
RANK 2, Keys generated
RANK 0, Keys generated
RANK 2, Publishing Keys
[pcmagas-System-Product-Name:19304] *** Process received signal ***
[pcmagas-System-Product-Name:19304] Signal: Segmentation fault (11)
[pcmagas-System-Product-Name:19304] Signal code:  (128)
[pcmagas-System-Product-Name:19304] Failing at address: (nil)
[pcmagas-System-Product-Name:19304] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f485819c390]
[pcmagas-System-Product-Name:19304] [ 1] /lib/x86_64-linux-gnu/libcrypto.so.1.0.0(BN_num_bits+0x1d)[0x7f485873acfd]
[pcmagas-System-Product-Name:19304] [ 2] ./builds/main[0x401461]
[pcmagas-System-Product-Name:19304] [ 3] ./builds/main[0x40127b]
[pcmagas-System-Product-Name:19304] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f4857de1830]
[pcmagas-System-Product-Name:19304] [ 5] ./builds/main[0x400ff9]
[pcmagas-System-Product-Name:19304] *** End of error message ***
RANK 1, Publishing Keys
[pcmagas-System-Product-Name:19303] *** Process received signal ***
[pcmagas-System-Product-Name:19303] Signal: Segmentation fault (11)
[pcmagas-System-Product-Name:19303] Signal code:  (128)
[pcmagas-System-Product-Name:19303] Failing at address: (nil)
[pcmagas-System-Product-Name:19303] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f2e56b90390]
[pcmagas-System-Product-Name:19303] [ 1] /lib/x86_64-linux-gnu/libcrypto.so.1.0.0(BN_num_bits+0x1d)[0x7f2e5712ecfd]
[pcmagas-System-Product-Name:19303] [ 2] ./builds/main[0x401461]
[pcmagas-System-Product-Name:19303] [ 3] ./builds/main[0x40127b]
[pcmagas-System-Product-Name:19303] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f2e567d5830]
[pcmagas-System-Product-Name:19303] [ 5] ./builds/main[0x400ff9]
[pcmagas-System-Product-Name:19303] *** End of error message ***
RANK 0, Publishing Keys
[pcmagas-System-Product-Name:19302] *** Process received signal ***
[pcmagas-System-Product-Name:19302] Signal: Segmentation fault (11)
[pcmagas-System-Product-Name:19302] Signal code:  (128)
[pcmagas-System-Product-Name:19302] Failing at address: (nil)
[pcmagas-System-Product-Name:19302] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fd967a55390]
[pcmagas-System-Product-Name:19302] [ 1] /lib/x86_64-linux-gnu/libcrypto.so.1.0.0(BN_num_bits+0x1d)[0x7fd967ff3cfd]
[pcmagas-System-Product-Name:19302] [ 2] ./builds/main[0x401461]
[pcmagas-System-Product-Name:19302] [ 3] ./builds/main[0x40127b]
[pcmagas-System-Product-Name:19302] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fd96769a830]
[pcmagas-System-Product-Name:19302] [ 5] ./builds/main[0x400ff9]
[pcmagas-System-Product-Name:19302] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 19304 on node pcmagas-System-Product-Name exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Makefile:16: recipe for target 'run' failed
make: *** [run] Error 139

DH 对象是否以某种方式被覆盖(据我所知,OpenSSL 不是线程和进程安全的,我试图通过单独的进程 运行 它)。

有时在任何函数(最好是在 main)上创建 DH 对象是个好主意。

在您的实现中,是在位于 dh.c 文件中的以下函数中创建的:

int generateKeys(DH *encryptionInfo) {
 int codes;

 if(NULL == (encryptionInfo = DH_new())) return -1;
 if(1 != DH_generate_parameters_ex(encryptionInfo, 2048, DH_GENERATOR_2, NULL)) return -1;
 if(1 != DH_check(encryptionInfo, &codes)) return -1;
 if(codes != 0) return -1;
 if(1 != DH_generate_key(encryptionInfo)) return -1;

 return 0;
}

所以可能会导致encryptionInfo获取无效地址,所以我建议将下面的代码移到main:

 if(NULL == (encryptionInfo = DH_new())) return -1;