来自递归方法的 IEnumerable 比使用 foreach 构造的相同 IEnumerable 慢 10 倍

IEnumerable from recursive method is 10x slower than same IEnumerable constructed with foreach

我不明白为什么一个 IEnumerable.Contains() 在下面的片段中比另一个更快,即使它们是相同的。

public class Group
{
    public static Dictionary<int, Group> groups = new Dictionary<int, Group>();

    // Members, user and groups
    public List<string> Users = new List<string>();
    public List<int> GroupIds = new List<int>();

    public IEnumerable<string> AggregateUsers()
    {
        IEnumerable<string> aggregatedUsers = Users.AsEnumerable();
        foreach (int id in GroupIds)
            aggregatedUsers = aggregatedUsers.Concat(groups[id].AggregateUsers());
        return aggregatedUsers;
    }
}

static void Main(string[] args)
{
    for (int i = 0; i < 1000; i++)
        Group.groups.TryAdd(i, new Group());

    for (int i = 0; i < 999; i++)
        Group.groups[i + 1].GroupIds.Add(i);

    for (int i = 0; i < 10000; i++)
        Group.groups[i/10].Users.Add($"user{i}");

    IEnumerable<string> users = Group.groups[999].AggregateUsers();

    Stopwatch stopwatch = Stopwatch.StartNew();
    bool contains1 = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from recursive function was {contains1} and took {stopwatch.ElapsedMilliseconds} ms");

    users = Enumerable.Empty<string>();
    foreach (Group group in Group.groups.Values.Reverse())
        users = users.Concat(group.Users);

    stopwatch = Stopwatch.StartNew();
    bool contains2 = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from foreach was {contains2} and took {stopwatch.ElapsedMilliseconds} ms");

    Console.Read();
}

这是执行此代码段获得的输出:

Search through IEnumerable from recursive function was True and took 40 ms
Search through IEnumerable from foreach was True and took 3 ms

该代码段模拟 10,000 个用户分布在 1,000 个组中,每个组 10 个用户。

每个组可以有 2 种类型的成员,用户(字符串)或其他组(表示该组 ID 的 int)。

每个组都有上一个组作为成员。所以第 0 组有 10 个用户,第 1 组有 10 个用户和来自第 0 组的用户,第 2 组有 10 个用户和第 1 组的用户..递归开始。

搜索的目的是确定用户 "user0"(接近列表末尾)是否是组 999(通过组关系包含所有 10,000 个用户)的成员。

问题来了,为什么用foreach构造的IEnumerable只用了3毫秒,而用递归方式构造的同一个IEnumerable却多了10倍?

一个有趣的问题。当我在 .NET Framework 中编译它时,执行时间大致相同(我不得不将 TryAdd Dictionary 方法更改为 Add)。

在 .NET Core 中,我得到了与您观察到的相同的结果。

我相信答案是延期执行。您可以在调试器中看到

IEnumerable<string> users = Group.groups[999].AggregateUsers();

分配给 users 变量将导致 Concat2Iterator 实例和第二个实例

users = Enumerable.Empty<string>();
foreach (Group group in Group.groups.Values.Reverse())
    users = users.Concat(group.Users);

将生成 ConcatNIterator。

来自 concat 的文档:

This method is implemented by using deferred execution. The immediate return value is an object that stores all the information that is required to perform the action. The query represented by this method is not executed until the object is enumerated either by calling its GetEnumerator method directly or by using foreach in Visual C# or For Each in Visual Basic.

你可以看看concat的代码here。 ConcatNIterator 和 Concat2Iterator 的 GetEnumerable 实现不同

所以我的猜测是,由于您使用 concat 构建查询的方式,第一个查询需要更长的时间来评估。如果您尝试在这样的可枚举之一上使用 ToList():

IEnumerable<string> users = Group.groups[999].AggregateUsers().ToList();

您会看到经过的时间几乎减少到 0 毫秒。

在阅读了 Mikołaj 的回答和 Servy 的评论后,我想出了解决问题的方法。谢谢!

public class Group
{
    public static Dictionary<int, Group> groups = new Dictionary<int, Group>();

    // Members, user and groups
    public List<string> Users = new List<string>();
    public List<int> GroupIds = new List<int>();

    public IEnumerable<string> AggregateUsers()
    {
        IEnumerable<string> aggregatedUsers = Users.AsEnumerable();
        foreach (int id in GroupIds)
            aggregatedUsers = aggregatedUsers.Concat(groups[id].AggregateUsers());
        return aggregatedUsers;
    }

    public IEnumerable<string> AggregateUsers(List<IEnumerable<string>> aggregatedUsers = null)
    {
        bool topStack = false;
        if (aggregatedUsers == null)
        {
            topStack = true;
            aggregatedUsers = new List<IEnumerable<string>>();
        }
        aggregatedUsers.Add(Users.AsEnumerable());
        foreach (int id in GroupIds)
            groups[id].AggregateUsers(aggregatedUsers);

        if (topStack)
            return aggregatedUsers.SelectMany(i => i);
        else
            return null;
    }
}

static void Main(string[] args)
{
    for (int i = 0; i < 1000; i++)
        Group.groups.TryAdd(i, new Group());

    for (int i = 0; i < 999; i++)
        Group.groups[i + 1].GroupIds.Add(i);

    for (int i = 0; i < 10000; i++)
        Group.groups[i / 10].Users.Add($"user{i}");

    Stopwatch stopwatch = Stopwatch.StartNew();
    IEnumerable<string> users = Group.groups[999].AggregateUsers();
    Console.WriteLine($"Aggregation via nested concatenation took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    bool contains = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from nested concatenation was {contains} and took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    users = Group.groups[999].AggregateUsers(null);
    Console.WriteLine($"Aggregation via SelectMany took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    contains = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from SelectMany was {contains} and took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    users = Enumerable.Empty<string>();
    foreach (Group group in Group.groups.Values.Reverse())
        users = users.Concat(group.Users);
    Console.WriteLine($"Aggregation via flat concatenation took {stopwatch.ElapsedMilliseconds} ms");

    stopwatch = Stopwatch.StartNew();
    contains = users.Contains("user0");
    Console.WriteLine($"Search through IEnumerable from flat concatenation was {contains} and took {stopwatch.ElapsedMilliseconds} ms");

    Console.Read();
}

结果如下:

Aggregation via nested concatenation took 0 ms
Search through IEnumerable from nested concatenation was True and took 43 ms
Aggregation via SelectMany took 1 ms
Search through IEnumerable from SelectMany was True and took 0 ms
Aggregation via foreach concatenation took 0 ms
Search through IEnumerable from foreach concatenation was True and took 2 ms