并行 for 循环,有条件地更新在循环外声明的对象

Parallel for loop with conditional update of object declared outside loop

我是 C# 并行编程的新手。我非常感谢有关如何使以下串行 for 循环片段并行的帮助。

我查看了有关此主题的其他问题,但它们要么更新简单类型,要么更新数组。另外,我发现的问题没有对for循环外的对象进行条件更新。

请特别注意,与类似问题的答案相反,我希望根据 属性 从列表中找到某个 OBJECT(不仅仅是 double 或 int)也需要 calculated/updated 在循环内,然后,基于 属性 ('fitness') 我需要决定是否更新循环外的对象。

这是遗传算法:

fitnessSum = 0;
DNA<T> best = Population[0];

//loop over the population
for (int i = 0; i < Population.Count; i++)
{
     //Get the fitness of the current candidate
     double fitnessThisPopulation = Population[i].CalculateFitness(i);                
     fitnessSum += fitnessThisPopulation;

     // If the current candidate has a better fitness, then update best fitness
     if (Population[i].fitness > best.fitness)
     {
         best = Population[i];
     }
}

BestFitness = best.fitness;

非常感谢任何帮助。作为一个前 FORTRAN 人,我不是 Linq 专家,所以我更愿意使用传统的 for 循环风格编写更多代码。非常感谢!

如果 CalculateFitness 很复杂并且迭代次数很大,那么任何分区策略都可能会给您带来性能提升。但是,如果它微不足道,则很难获得显着收益,因为启动任务和线程的成本将超过数组的实际 JIT 优化迭代。

所以这个答案不是关于什么更快,谁知道呢,它取决于您的系统、内核、CPU、数据、框架、计算以及星期几....然而,它向您展示了如何使用 BenchmarkDotNet 自行测试。我已经包含了一个 PLinq(并行分区)计算以供比较:

给定

public class DNA
{
   public double fitness;

   public double CalculateFitness(int i)
   {
      // some weird complex sum
      var result = fitness;
      for (int j = 0; j < 1000; j++)
         result += result-i;
      return result;
   }
}

基准代码

[SimpleJob(RuntimeMoniker.Net50)]
[MemoryDiagnoser]
public class Test
{
   private DNA[] _population;
   [Params(100000, 1000000)] public int N;

   [GlobalSetup]
   public void Setup()
   {
      var r = new Random(42);
      _population = Enumerable
         .Range(0, N)
         .Select(x => new DNA
         {
            fitness = r.Next(0, 10000)
         }).ToArray();
   }
      
   [Benchmark]
   public void New()
   {
      var fitnessSum = _population
         .AsParallel()
         .Select((x, i) => x.CalculateFitness(i))
         .Sum();

      var best = _population
         .AsParallel()
         .Max(x => x.fitness);
   }

   [Benchmark]
   public void New2()
   {
      var fitnessSum = _population
         .AsParallel()
         .Select((x, i) => x.CalculateFitness(i))
         .Sum();

      var best = _population
         .Max(x => x.fitness);
   }
      
   [Benchmark]
   public void New3()
   {
       var best = _population[0].fitness;

      var fitnessSum = _population
         .AsParallel()
         .Select((x, i) =>
         {
            AssignIfNewValueGreater(ref best, _population[i].fitness);
            return x.CalculateFitness(i);
         })
         .Sum();

   }

   [MethodImpl(MethodImplOptions.AggressiveInlining)]
   public static void AssignIfNewValueGreater(ref double target, double newValue)
   {
      // be careful, this method is only good for 64 operating systems
      double temp;
      do
      {
         temp = target;
      } while (newValue > temp && Interlocked.CompareExchange(ref target, newValue, temp) != temp);
   }

   [Benchmark]
   public void Old()
   {
      double fitnessSum = 0;
      var best = _population[0];

      //loop over the population
      for (var i = 0; i < _population.Length; i++)
      {
         //Get the fitness of the current candidate
         var fitnessThisPopulation = _population[i].CalculateFitness(i);
         fitnessSum += fitnessThisPopulation;

         //If the current candidate has a better fitness, then update best fitness
         if (_population[i].fitness > best.fitness) best = _population[i];
      }

      var bestFitness = best.fitness;
   }
}

用法

BenchmarkRunner.Run<Test>();

结果

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1288 (21H1/May2021Update)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK=6.0.100-rc.1.21463.6
  [Host]   : .NET 5.0.11 (5.0.1121.47308), X64 RyuJIT  [AttachedDebugger]
  .NET 5.0 : .NET 5.0.11 (5.0.1121.47308), X64 RyuJIT

Job=.NET 5.0  Runtime=.NET 5.0
Method N Mean Error StdDev Allocated
New 100000 8.082 ms 0.1616 ms 0.3582 ms 27,129 B
New2 100000 8.205 ms 0.1610 ms 0.2458 ms 13,008 B
New3 100000 7.725 ms 0.1510 ms 0.2439 ms 13,072 B
Old 100000 148.285 ms 0.0301 ms 0.0281 ms -
New 1000000 75.982 ms 1.5130 ms 2.4432 ms 27,160 B
New2 1000000 76.613 ms 1.5147 ms 2.9900 ms 13,008 B
New3 1000000 71.590 ms 1.4137 ms 3.1619 ms 13,072 B
Old 1000000 1,483.186 ms 0.5073 ms 0.4236 ms 704 B

免责声明:显然,您可以尝试许多其他并行策略(基于 linq 或其他),这些策略值得进行基准测试, 这只是一个例子

补充 :在投入更多内核和线程解决问题之前,首先您需要确保存在性能问题,并且如果你确定你首先尽可能有效地解决了这个问题。使用更多线程使低效算法更快是没有意义的