Execution time increases even with more CPUs. Why?

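I ran the same C++ program, with the same problem size, on different numbers of CPUs on an HPC cluster, and I found that the execution time goes up as the number of CPUs goes up. I expected the execution time to drop significantly. Can anyone shed some light on this?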

Here are the execution times for each CPU count:

  Number of CPUs      Problem size         Time (seconds)
  1                   3000000              15.48
  2                   3000000              18.2
  4                   3000000              21.73
  8                   3000000              40.55
  16                  3000000              60.14
  32                  3000000              98.75
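
To put numbers on how far this is from the expected behaviour, the timings above can be turned into speedup, S(p) = T(1) / T(p), and parallel efficiency, E(p) = S(p) / p. The following is only a minimal sketch: the names `Run`, `speedup`, `efficiency`, `measured_runs` and `print_scaling_report` are made up for illustration, and the only data used are the six rows of the table.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// One row of the timing table from the question (hypothetical helper type).
struct Run {
    int cpus;        // number of CPUs used
    double seconds;  // measured execution time
};

// Speedup relative to the single-CPU run: S(p) = T(1) / T(p).
double speedup(double t1, double tp) { return t1 / tp; }

// Parallel efficiency: E(p) = S(p) / p. Ideal scaling would give 1.0.
double efficiency(double t1, double tp, int cpus) {
    return speedup(t1, tp) / cpus;
}

// The measurements from the table, copied as-is.
std::vector<Run> measured_runs() {
    return {{1, 15.48}, {2, 18.2}, {4, 21.73},
            {8, 40.55}, {16, 60.14}, {32, 98.75}};
}

// Prints speedup and efficiency for every row of the table.
void print_scaling_report() {
    const std::vector<Run> runs = measured_runs();
    const double t1 = runs.front().seconds;  // single-CPU baseline
    for (const Run& r : runs) {
        std::printf("%2d CPUs: speedup %.3f, efficiency %.3f\n",
                    r.cpus, speedup(t1, r.seconds),
                    efficiency(t1, r.seconds, r.cpus));
    }
}
```

Calling `print_scaling_report()` from `main` prints one line per row. Every speedup beyond the first row is below 1; for 32 CPUs it is roughly 0.16, meaning the run takes more than six times as long as the single-CPU run.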

My thoughts:

Hopefully this explains it:

"There are two major factors that influence performance: the speed of the CPUs themselves, and the speed of their access to memory. In a cluster, it’s fairly obvious that a given CPU will have fastest access to the RAM within the same computer (node). Perhaps more surprisingly, similar issues are relevant on a typical multicore laptop, due to differences in the speed of main memory and the cache. Consequently, a good multiprocessing environment should allow control over the “ownership” of a chunk of memory by a particular CPU."
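
The cache versus main memory point in the quote can be observed with a small experiment. This is a rough single-threaded sketch, not a diagnosis of the program in the question: it sums the same array twice, once contiguously and once with a large stride, so both passes do the same arithmetic and only the memory access pattern differs. The names `strided_sum` and `time_traversal_ms` are hypothetical, and the timings will vary with hardware and compiler flags.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Sums every element exactly once, visiting them in steps of `stride`.
// stride == 1 walks memory contiguously; a large stride jumps far ahead
// on each access, so far fewer loads are served from cache.
long long strided_sum(const std::vector<int>& data, std::size_t stride) {
    long long total = 0;
    for (std::size_t start = 0; start < stride; ++start) {
        for (std::size_t i = start; i < data.size(); i += stride) {
            total += data[i];
        }
    }
    return total;
}

// Times one traversal in milliseconds and returns the sum via sum_out.
double time_traversal_ms(const std::vector<int>& data, std::size_t stride,
                         long long& sum_out) {
    const auto begin = std::chrono::steady_clock::now();
    sum_out = strided_sum(data, stride);
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - begin).count();
}
```

A possible use is to fill a `std::vector<int>` that is much larger than the last-level cache, call `time_traversal_ms` with a stride of 1 and then with a stride of a few thousand, and compare the two times. The sums must be identical; on most machines the strided pass is expected to be noticeably slower, although that is not guaranteed.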