Adding two arrays in parallel

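I'm new to parallel programming. I'm developing a program that works with numbers in different bases, and I'd like to parallelize the addition.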

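First it computes the linear convolution (here, the digit-wise sum) and then it does the carrying, so each iteration of the for loop is independent (example):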

(Little-endian, btw) [2,7,1] + [8,7,5] = [10,14,6] => [0,5,7]

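The question is: if threads are available, can I speed up the addition by running the iterations simultaneously on different threads? If so, how?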

If the arrays are huge (threading has its own overhead), you can try Parallel, e.g. Parallel.For:

  int[] left  = ...
  int[] right = ...
  int[] result = new int[left.Length];

  ...

  Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);
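Note that the one-liner above covers only the digit-wise sum; the carry step from the question still has to run afterwards, and since each carry depends on the previous digit, it is the part that stays sequential. A minimal sketch of both steps, assuming little-endian digit arrays of equal length (the `DigitAdder` class and its `numberBase` parameter are hypothetical names used only for illustration):

```csharp
using System;
using System.Threading.Tasks;

static class DigitAdder
{
    // Hypothetical helper: adds two little-endian digit arrays of equal length.
    public static int[] Add(int[] left, int[] right, int numberBase)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Arrays must have the same length.");

        // One extra slot holds a possible carry out of the top digit.
        int[] result = new int[left.Length + 1];

        // Step 1: digit-wise sums are independent, so they can run in parallel.
        Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);

        // Step 2: carry propagation; digit i depends on digit i - 1, so this is sequential.
        int carry = 0;
        for (int i = 0; i < result.Length; i++)
        {
            int value = result[i] + carry;
            result[i] = value % numberBase;
            carry = value / numberBase;
        }

        return result;
    }
}
```

With the numbers from the question, `DigitAdder.Add(new[] { 2, 7, 1 }, new[] { 8, 7, 5 }, 10)` gives `[0, 5, 7, 0]`; the trailing zero is the unused carry slot.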

Let's have a look at the effect:

  using System;
  using System.Diagnostics;
  using System.Threading.Tasks;

  int N = 100_000_000;

  int[] left   = new int[N];
  int[] right  = new int[N];
  int[] result = new int[left.Length];

  // Force a full collection up front so that a GC is less likely to kick in while timing
  GC.Collect(2);

  Stopwatch sw = new Stopwatch();

  sw.Start();

  // Parallel version
  //Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);

  // Standard for loop version
  for (int i = left.Length - 1; i >= 0; --i)
    result[i] = left[i] + right[i];

  sw.Stop();

  Console.Write(sw.ElapsedMilliseconds);

Results (.NET 6 x64, Release build, Core i9, 6 cores), elapsed time in milliseconds:

  200 ms - parallel version
  500 ms - for loop version

When you have fine-grained work to do, like adding two integers, and you use the Parallel.For method for it, you'll find that the synchronization overhead, together with the overhead of invoking a non-inlinable lambda for each index, can negate much of the performance gained by the parallelization. In this case it's a good idea to chunkify the workload by operating on ranges of indices instead of one index at a time. Here is how you can do that with the Parallel.ForEach + Partitioner.Create combination:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var left = new int[1_000_000];
var right = new int[1_000_000];
var sum = new int[1_000_000];

var parallelOptions = new ParallelOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.ForEach(Partitioner.Create(0, left.Length), parallelOptions, range =>
{
    for (int i = range.Item1; i < range.Item2; i++)
    {
        sum[i] = left[i] + right[i];
    }
});

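Partitioner.Create creates approximately three times as many ranges as the value of Environment.ProcessorCount, so on a quad-core machine it creates about 12 ranges in total. This is a good compromise between too many ranges (overhead) and too few ranges (unbalanced workload). Of course, you could implement your own partitioning and fine-tune the size of each partition, instead of using the somewhat rigid and dated Partitioner.Create method.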
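Short of writing a custom partitioner, one way to tune the chunk size is the Partitioner.Create(fromInclusive, toExclusive, rangeSize) overload, which lets the caller pick the size of each range. The sketch below assumes arrays of equal length; the `ChunkedAdder` class is a hypothetical name, and a good `rangeSize` has to be found by measuring on the target machine:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class ChunkedAdder
{
    // Hypothetical helper: adds two arrays in chunks of rangeSize indices each.
    public static int[] Add(int[] left, int[] right, int rangeSize)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Arrays must have the same length.");

        int[] sum = new int[left.Length];

        // Partitioner.Create rejects an empty range, so handle it up front.
        if (left.Length == 0)
            return sum;

        // Each range is a Tuple<int, int>: (inclusive start, exclusive end).
        var ranges = Partitioner.Create(0, left.Length, rangeSize);

        Parallel.ForEach(ranges, range =>
        {
            // Plain inner loop: one lambda invocation per range, not per index.
            for (int i = range.Item1; i < range.Item2; i++)
                sum[i] = left[i] + right[i];
        });

        return sum;
    }
}
```

For example, `ChunkedAdder.Add(left, right, 50_000)` splits one million elements into 20 ranges; smaller ranges balance the load better, larger ones reduce the per-range overhead.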