Aggregate custom data from ParallelQuery
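I am parsing a large number of words (millions) from a set of files and counting them by language. I am using PLINQ for performance, but judging by Task Manager the whole process seems to run sequentially. It may be blocked by my aggregate function.
Is that possible?
Here is the offending PLINQ query: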
ParallelQuery<string> query = Directory.EnumerateFiles(test, "*.d", SearchOption.AllDirectories).AsParallel();
query = query.SelectMany(parseStrings).Where(isValidPhrase);
query = query.SelectMany(s => Regex.Matches(s, @"\w+").Cast<Match>().Select(match => match.Value));
Result output = query.Aggregate(new Result(), (result, word) =>
{
    if (word.All(russianAlfabet.Contains))
        result.Ru++;
    else if (czechWords.Contains(word))
        result.Cs++;
    else
        result.Other++;
    return result;
});
...and this is the aggregate result class:
class Result {
    public int Ru { get; set; }
    public int Cs { get; set; }
    public int Other { get; set; }
}
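The Aggregate overload you are calling takes only a seed and an update function. It has no function for merging partial results, so the accumulation step cannot be split across partitions. Try this overload of ParallelEnumerable.Aggregate instead, which adds a combineAccumulatorsFunc for merging per-partition accumulators: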
public static TResult Aggregate<TSource, TAccumulate, TResult>(
    this ParallelQuery<TSource> source,
    TAccumulate seed,
    Func<TAccumulate, TSource, TAccumulate> updateAccumulatorFunc,
    Func<TAccumulate, TAccumulate, TAccumulate> combineAccumulatorsFunc,
    Func<TAccumulate, TResult> resultSelector
)
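A minimal sketch of how the word count could look with a combining Aggregate. It makes one deliberate swap: instead of the seed overload above, it uses the sibling overload that takes a `Func<TAccumulate> seedFactory`. Result is a mutable class, and as far as I understand a single seed instance would otherwise be shared by every partition, which risks racy increments. `LanguageCounter`, `RussianAlphabet`, `CzechWords` and the sample words are hypothetical stand-ins for the question's `russianAlfabet`, `czechWords` and file input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mutable accumulator, mirroring the Result class from the question.
class Result
{
    public int Ru { get; set; }
    public int Cs { get; set; }
    public int Other { get; set; }
}

static class LanguageCounter
{
    // Hypothetical lookup data; the question's real sets are not shown.
    static readonly HashSet<char> RussianAlphabet =
        new HashSet<char>("абвгдеёжзийклмнопрстуфхцчшщъыьэюя");
    static readonly HashSet<string> CzechWords =
        new HashSet<string> { "ahoj", "svět", "pes" };

    public static Result Count(IEnumerable<string> words)
    {
        return words.AsParallel().Aggregate(
            // seedFactory: each partition gets its own fresh accumulator.
            () => new Result(),
            // updateAccumulatorFunc: runs inside a partition, one word at a time.
            (result, word) =>
            {
                if (word.All(c => RussianAlphabet.Contains(c)))
                    result.Ru++;
                else if (CzechWords.Contains(word))
                    result.Cs++;
                else
                    result.Other++;
                return result;
            },
            // combineAccumulatorsFunc: merges two per-partition accumulators.
            (left, right) =>
            {
                left.Ru += right.Ru;
                left.Cs += right.Cs;
                left.Other += right.Other;
                return left;
            },
            // resultSelector: the merged accumulator is already the final result.
            total => total);
    }
}

static class Program
{
    static void Main()
    {
        var words = new[] { "привет", "ahoj", "hello", "мир", "pes" };
        Result r = LanguageCounter.Count(words);
        Console.WriteLine($"Ru={r.Ru} Cs={r.Cs} Other={r.Other}");
        // prints "Ru=2 Cs=2 Other=1"
    }
}
```

In the original query, the same four lambdas would replace the two-argument `query.Aggregate(new Result(), ...)` call. The update lambda only touches its own partition's Result, so the counters need no locking.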