Parallel.ForEach: return results ordered by input, not by execution, into a list/dictionary

If you have this:

var resultlist = new List<Dictionary<DateTime, double>>();
Parallel.ForEach(input, item =>
{
    resultlist.Add(SomeDataDictionary(item));
});

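The returned data ends up in the order in which the calls to SomeDataDictionary finish, not in the order of the input.

Is there a way to preserve the input order?

Or is the only option to change the data type and use a Parallel.For loop, passing the index into some kind of array as the return type?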

List<T> is not thread-safe, which is why calling resultlist.Add from inside Parallel.ForEach is incorrect. I suggest using PLINQ instead:

 var resultlist = input
   .AsParallel()
   // .AsOrdered() // uncomment this if you want to preserve input order 
   .Select(item => SomeDataDictionary(item))
   .ToList(); 
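For anyone who wants to try this in isolation, here is a minimal, self-contained sketch of the ordered PLINQ query. The class name, the Run helper and the body of SomeDataDictionary are placeholders made up for illustration; the real SomeDataDictionary from the question is not shown anywhere in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class OrderedPlinqSketch
{
    // Hypothetical stand-in for SomeDataDictionary: a single entry derived from the item.
    static Dictionary<DateTime, double> SomeDataDictionary(int item)
    {
        return new Dictionary<DateTime, double>
        {
            { new DateTime(2000, 1, 1).AddDays(item), item * 2.0 }
        };
    }

    public static List<Dictionary<DateTime, double>> Run(IEnumerable<int> input)
    {
        return input
            .AsParallel()
            .AsOrdered() // the result list follows the order of the input sequence
            .Select(item => SomeDataDictionary(item))
            .ToList();
    }

    static void Main()
    {
        var results = Run(Enumerable.Range(1, 100));

        // Element i of the result corresponds to input item i + 1.
        Console.WriteLine(results[0].Values.Single());  // 2
        Console.WriteLine(results[99].Values.Single()); // 200
    }
}
```

Without AsOrdered the same query is still thread-safe, but the position of each result in the list is no longer guaranteed to match the input.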

Solution using ConcurrentDictionary

You can use a ConcurrentDictionary because it is thread-safe, and you can use the key to store the position of each item in the input.

// The index passed to the loop body is a long, so use long as the key type
var resultDictionary = new ConcurrentDictionary<long, Dictionary<DateTime, double>>();

// Use the loop index supplied by Parallel.ForEach as the key
Parallel.ForEach(input, (item, state, index) => {
    resultDictionary.TryAdd(index, SomeDataDictionary(item));
});

// Convert the dictionary to a list in the required order
var resultList = resultDictionary.Keys.OrderBy(k => k).Select(k => resultDictionary[k]).ToList();
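The question also asks about Parallel.For with an index into an array. That works too; a rough sketch follows, assuming the input can be indexed (an array or a list). The class name, the Run helper and the body of SomeDataDictionary are again placeholders of mine, not code from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class IndexedArraySketch
{
    // Hypothetical stand-in for SomeDataDictionary.
    static Dictionary<DateTime, double> SomeDataDictionary(int item)
    {
        return new Dictionary<DateTime, double>
        {
            { new DateTime(2000, 1, 1).AddDays(item), item * 2.0 }
        };
    }

    public static Dictionary<DateTime, double>[] Run(IReadOnlyList<int> input)
    {
        // One slot per input item, allocated before the loop starts.
        var results = new Dictionary<DateTime, double>[input.Count];

        Parallel.For(0, input.Count, i =>
        {
            // Each iteration writes only to its own slot, so no lock is needed.
            results[i] = SomeDataDictionary(input[i]);
        });

        return results;
    }

    static void Main()
    {
        var results = Run(new[] { 10, 20, 30 });

        // results[i] belongs to input[i], regardless of which iteration finished first.
        Console.WriteLine(results.Length); // 3
    }
}
```

Because the slot is chosen by the index, the order falls out for free and there is no sorting step afterwards.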

ConcurrentDictionary vs. PLINQ performance

Dmitry Bychenko provided a valid PLINQ solution in a separate answer.

Let's build a test harness to compare the two solutions:

class so42112722
{
    private readonly int[] input = Enumerable.Range(1, 5000).ToArray();

    public so42112722()
    {

    }

    public void RunTest()
    {
        var t1 = timeAction(ParallelUsingLoopStateAndDictionary);
        var t2 = timeAction(ParallelUsingPLinq);

        var diff = (t1 - t2);
        var pct = diff / (t1 > t2 ? t2 : t1);

        Console.WriteLine("| {0:0,000.000} | {1:0,000.000} | {2} is {3:0.00%} faster!", t1, t2, (diff > 0 ? "PLinq" : "ConcurrentDictionary"), Math.Abs(pct));
    }


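    // Runs the action once and returns the elapsed wall-clock time in milliseconds.
    double timeAction(Action action)
    {
        // Stopwatch has a finer resolution than subtracting two DateTime.Now values.
        var stopwatch = System.Diagnostics.Stopwatch.StartNew();

        action();

        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds;
    }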

    private void ParallelUsingLoopStateAndDictionary()
    {
        // The loop index is a long, so use long as the key type
        var resultDictionary = new ConcurrentDictionary<long, Dictionary<DateTime, double>>();

        Parallel.ForEach(input, (item, state, index) =>
        {
            resultDictionary.TryAdd(index, ExpensiveTransformation(item));
        });

        var resultList = resultDictionary.Keys.OrderBy(k => k).Select(k => resultDictionary[k]).ToList();

    }

    private void ParallelUsingPLinq()
    {
        var resultList = input
            .AsParallel()
            .AsOrdered()
            .Select(item => ExpensiveTransformation(item))
            .ToList();
    }

    private Dictionary<DateTime, double> ExpensiveTransformation(double item)
    {
        Random rnd = new Random();
        int iterCount = 5000;

        var dict = new Dictionary<DateTime, double>();

        for (int i = 0; i < iterCount; i++)
        {
            DateTime dt = DateTime.Now.AddDays(-i * 3).AddMinutes(i).AddSeconds(item * rnd.Next(100, 1000)).AddMilliseconds(-i);

            var val = Math.Pow(item, rnd.Next(2, 5)) + rnd.Next(100, iterCount) / (i + 1);

            dict.Add(dt, val);
        }

        return dict;
    }

}

Now we can run the test from a simple console application:

static void Main(string[] args)
{

    so42112722 test = new so42112722();

    Console.WriteLine("Comparing ConcurrentDictionary to PLinq:");

    for (int i = 0; i < 10; i++)
    {
        test.RunTest();
    }

    Console.ReadLine();
}

Here are the results (first column: ConcurrentDictionary, second column: PLINQ):

Comparing ConcurrentDictionary to PLinq:
| 7,310.756 | 7,597.217 | ConcurrentDictionary is 3.92% faster!
| 7,883.528 | 7,978.108 | ConcurrentDictionary is 1.20% faster!
| 8,075.709 | 8,072.501 | PLinq is 0.04% faster!
| 8,206.721 | 8,193.054 | PLinq is 0.17% faster!
| 8,256.499 | 8,305.187 | ConcurrentDictionary is 0.59% faster!
| 8,424.029 | 8,286.195 | PLinq is 1.66% faster!
| 8,316.973 | 8,261.499 | PLinq is 0.67% faster!
| 8,312.165 | 8,254.285 | PLinq is 0.70% faster!
| 8,328.433 | 8,369.385 | ConcurrentDictionary is 0.49% faster!
| 8,472.054 | 8,344.197 | PLinq is 1.53% faster!

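(Numbers are in milliseconds.)

This test was run on a quad-core Intel Core i5 CPU. Your mileage may vary.

PLINQ was faster in 6 runs out of 10, but the difference is tiny. Summed over all 10 test iterations, the ConcurrentDictionary approach came out ahead by a staggering 74.76 milliseconds (0.092%). It looks a lot like the United States presidential election, 2016, where you can win more votes and still lose :).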
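The totals can be re-derived from the table. Here is a small sketch that sums both columns and counts the runs PLINQ won; the class and method names are made up for illustration, and the numbers are copied from the results above.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class TotalsCheck
{
    // Timings in milliseconds, copied from the results table.
    static readonly double[] ConcurrentDictionaryMs =
    {
        7310.756, 7883.528, 8075.709, 8206.721, 8256.499,
        8424.029, 8316.973, 8312.165, 8328.433, 8472.054
    };

    static readonly double[] PlinqMs =
    {
        7597.217, 7978.108, 8072.501, 8193.054, 8305.187,
        8286.195, 8261.499, 8254.285, 8369.385, 8344.197
    };

    // Total PLINQ time minus total ConcurrentDictionary time.
    public static double TotalDifferenceMs()
    {
        return PlinqMs.Sum() - ConcurrentDictionaryMs.Sum();
    }

    // Number of runs in which PLINQ was the faster of the two.
    public static int PlinqWins()
    {
        return PlinqMs.Zip(ConcurrentDictionaryMs, (p, c) => p < c).Count(won => won);
    }

    static void Main()
    {
        double diff = TotalDifferenceMs();
        double pct = diff / ConcurrentDictionaryMs.Sum() * 100;

        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "{0:F2} ms ({1:F3}%)", diff, pct)); // 74.76 ms (0.092%)
        Console.WriteLine(PlinqWins());         // 6
    }
}
```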

Verdict

Don't try to over-optimise your code. The .NET Framework is there to help you. If PLINQ simplifies your code, use it; if, on the other hand, you need more control, take it.