Adding two arrays in parallel
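I'm new to parallel programming. I'm developing a program that works with numbers in different bases, and I want to parallelize addition.
First it computes the linear convolution, then it performs the carrying, so each iteration of the for loop is independent (example):
(little-endian, by the way) [2,7,1] + [8,7,5] = [10,14,6] => [0,5,7]
The question is: if threads are available, can I speed up the addition by running the iterations simultaneously on different threads? If so, how?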
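For reference, the two steps in the question can be sketched sequentially as below. This is a minimal sketch, not the asker's actual code: the class name `DigitArrays` and both method names are hypothetical, and it assumes equal-length arrays of non-negative digits and a radix of at least 2. The element-wise sum has independent iterations, while the carry pass reads the carry of the previous digit and so stays sequential here.

```csharp
using System;

public static class DigitArrays
{
    // Element-wise sum of two equal-length little-endian digit arrays.
    // Each iteration touches only index i, so iterations are independent.
    public static int[] AddDigits(int[] left, int[] right)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Arrays must have the same length.");

        var sum = new int[left.Length];
        for (int i = 0; i < left.Length; i++)
            sum[i] = left[i] + right[i];
        return sum;
    }

    // Carry propagation: digit i depends on the carry out of digit i - 1,
    // so this pass is written as a plain sequential loop.
    // Assumes non-negative digits and radix >= 2.
    public static int[] PropagateCarry(int[] digits, int radix)
    {
        var result = new int[digits.Length + 1];
        int carry = 0;
        for (int i = 0; i < digits.Length; i++)
        {
            int value = digits[i] + carry;
            result[i] = value % radix;
            carry = value / radix;
        }
        result[digits.Length] = carry;

        // Drop the extra most-significant digit when there was no final carry.
        if (carry == 0)
            Array.Resize(ref result, digits.Length);
        return result;
    }
}
```

With the question's example, `AddDigits` gives [10,14,6] and `PropagateCarry` with radix 10 gives [0,5,7]. Only the first step is a candidate for the parallel loops discussed in the answers.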
If the arrays are huge (threads have their own overhead), you can try the Parallel class, for example Parallel.For:
int[] left = ...
int[] right = ...
int[] result = new int[left.Length];
...
Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);
Let's have a look at the effect:
int N = 100_000_000;
int[] left = new int[N];
int[] right = new int[N];
int[] result = new int[left.Length];
// Force a full collection now to reduce the chance of a GC during the measurement
GC.Collect(2);
Stopwatch sw = new Stopwatch();
sw.Start();
// Parallel version
//Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);
// Standard for loop version
for (int i = left.Length - 1; i >= 0; --i)
result[i] = left[i] + right[i];
sw.Stop();
Console.Write(sw.ElapsedMilliseconds);
Results in milliseconds (.NET 6, x64, Release build, Core i9, 6 cores):
200 ms - parallel version
500 ms - for loop version
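A single timed run like the one above includes JIT compilation and the first touch of freshly allocated memory, so the numbers can shift noticeably between runs. Below is a hedged sketch of a slightly more careful measurement: one untimed warm-up run, then the fastest of several timed runs. The class `AddBenchmark` and its method names are hypothetical helpers, not part of any library.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AddBenchmark
{
    // Plain sequential element-wise addition.
    public static void AddSequential(int[] left, int[] right, int[] result)
    {
        for (int i = 0; i < left.Length; i++)
            result[i] = left[i] + right[i];
    }

    // Same work, one lambda invocation per index via Parallel.For.
    public static void AddParallel(int[] left, int[] right, int[] result)
    {
        Parallel.For(0, left.Length, i => result[i] = left[i] + right[i]);
    }

    // Runs the action once untimed (warm-up), then returns the fastest
    // of `runs` timed runs, in milliseconds.
    public static long BestOf(int runs, Action action)
    {
        if (runs < 1)
            throw new ArgumentOutOfRangeException(nameof(runs));

        action(); // warm-up: JIT compilation and first memory touch

        long best = long.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            var sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            best = Math.Min(best, sw.ElapsedMilliseconds);
        }
        return best;
    }
}
```

A call such as `AddBenchmark.BestOf(5, () => AddBenchmark.AddParallel(left, right, result))` can then be compared with the same call using `AddSequential`. The timings still depend on the machine and array size, so treat them as indicative only.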
When you have fine-grained work to do, like adding two integers, and you use the Parallel.For method for it, you'll find that the synchronization overhead, as well as the overhead of invoking a non-inlinable lambda for each index, can negate much of the performance gained by the parallelization. In this case it's a good idea to chunkify the workload by operating on ranges of indices instead of one index at a time. Here is how you can use the Parallel.ForEach + Partitioner.Create methods:
var left = new int[1_000_000];
var right = new int[1_000_000];
var sum = new int[1_000_000];
var parallelOptions = new ParallelOptions
{
MaxDegreeOfParallelism = Environment.ProcessorCount
};
Parallel.ForEach(Partitioner.Create(0, left.Length), parallelOptions, range =>
{
for (int i = range.Item1; i < range.Item2; i++)
{
sum[i] = left[i] + right[i];
}
});
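The Partitioner.Create method creates about three times as many ranges as the Environment.ProcessorCount value, so on a quad-core machine it will create about 12 ranges in total. This is a good compromise between too many ranges (overhead) and too few ranges (unbalanced workload). Of course, you could consider implementing your own partitioning and fine-tuning the size of each partition, instead of using the somewhat rigid and dated Partitioner.Create method.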
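If you want to tune the partition size without writing a partitioner from scratch, one option is the `Partitioner.Create(int fromInclusive, int toExclusive, int rangeSize)` overload, which lets you set the range size yourself. The sketch below assumes equal-length arrays. The class name `ChunkedAdd` is hypothetical, and the right `rangeSize` depends on the machine and the workload, so it should be found by measuring.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ChunkedAdd
{
    // Adds two equal-length arrays in parallel, one index range per work item.
    // rangeSize is a tuning parameter: the number of indices per range.
    public static int[] Add(int[] left, int[] right, int rangeSize)
    {
        if (left.Length != right.Length)
            throw new ArgumentException("Arrays must have the same length.");
        if (rangeSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(rangeSize));

        var sum = new int[left.Length];

        // Partitioner.Create rejects an empty range, so handle it up front.
        if (left.Length == 0)
            return sum;

        Parallel.ForEach(Partitioner.Create(0, left.Length, rangeSize), range =>
        {
            // range is a Tuple<int, int>: [Item1, Item2).
            for (int i = range.Item1; i < range.Item2; i++)
                sum[i] = left[i] + right[i];
        });
        return sum;
    }
}
```

Each thread writes to a disjoint slice of `sum`, so no locking is needed. Larger ranges mean less scheduling overhead but coarser load balancing.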