Linq Count vs IList Count

Question

Suppose I have the following IEnumerable list coming from some repository.

IEnumerable<SomeObject> items = _someRepo.GetAll();

Which is faster:

items.Count(); // Using Linq on the IEnumerable interface.

or

List<SomeObject> temp = items.ToList<SomeObject>(); // Cast as a List

temp.Count(); // Do a count on a list

Is the Linq Count() faster or slower than converting the IEnumerable to a List and then calling Count()?

Update: Improved the question slightly to a more realistic scenario.

Answer 1

Either version requires (in the general case) that you fully iterate the IEnumerable<string>.

In some cases, the backing type provides a mechanism for determining the count directly, which gives O(1) performance. See @Marcin's answer for details.

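The version where you call ToList() has some additional CPU overhead, though it is very small and probably hard to measure. It also allocates memory that would not otherwise be allocated. If your count is high, that is the bigger concern.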
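As a rough illustration of the general case, here is a minimal sketch using a lazy sequence. The names CountDemo, LazyItems and Yielded are hypothetical and exist only for this example. The counter makes it visible that both approaches walk every item, while only ToList() also stores them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountDemo
{
    // Hypothetical counter: how many items the lazy source has produced.
    public static int Yielded;

    // A lazy sequence with no Count property, so it must be walked to be counted.
    public static IEnumerable<int> LazyItems(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Yielded++;
            yield return i;
        }
    }

    public static void Main()
    {
        Yielded = 0;
        int viaCount = LazyItems(1000).Count();      // walks every item, keeps none
        Console.WriteLine($"Count(): {viaCount}, items yielded: {Yielded}");

        Yielded = 0;
        List<int> copy = LazyItems(1000).ToList();   // walks every item and stores them
        Console.WriteLine($"ToList().Count: {copy.Count}, items yielded: {Yielded}");
    }
}
```

Both lines should report 1000 items yielded. The difference is the List<int> that the second call has to allocate and fill.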

Answer 2

Calling Count directly is the better choice.

Enumerable.Count has some performance improvements built in that let it return without enumerating the entire collection:

public static int Count<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    ICollection<TSource> collectionoft = source as ICollection<TSource>;
    if (collectionoft != null) return collectionoft.Count;
    ICollection collection = source as ICollection;
    if (collection != null) return collection.Count;
    int count = 0;
    using (IEnumerator<TSource> e = source.GetEnumerator()) {
        checked {
            while (e.MoveNext()) count++;
        }
    }
    return count;
}

ToList() uses a similar optimization, baked into the List<T>(IEnumerable<T> source) constructor:

public List(IEnumerable<T> collection) {
    if (collection==null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
    Contract.EndContractBlock();

    ICollection<T> c = collection as ICollection<T>;
    if( c != null) {
        int count = c.Count;
        if (count == 0)
        {
            _items = _emptyArray;
        }
        else {
            _items = new T[count];
            c.CopyTo(_items, 0);
            _size = count;
        }
    }    
    else {                
        _size = 0;
        _items = _emptyArray;
        // This enumerable could be empty.  Let Add allocate a new array, if needed.
        // Note it will also go to _defaultCapacity first, not 1, then 2, etc.

        using(IEnumerator<T> en = collection.GetEnumerator()) {
            while(en.MoveNext()) {
                Add(en.Current);                                    
            }
        }
    }
}

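But as you can see, it only checks for the generic ICollection<T>, so if your collection implements ICollection but not its generic version, calling Count() directly will be much faster.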

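Not calling ToList first also saves you the allocation of a new List<T> instance - not something overly expensive, but it is better to avoid unnecessary allocations whenever possible.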
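To make the ICollection versus ICollection<T> point concrete, here is a minimal sketch. LegacyBag<T> is a hypothetical type invented for this example: it implements IEnumerable<T> and the non-generic ICollection, but not ICollection<T>. Its Enumerations counter records how often an enumerator is requested.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection: non-generic ICollection plus IEnumerable<T>,
// deliberately NOT ICollection<T>.
public sealed class LegacyBag<T> : IEnumerable<T>, ICollection
{
    private readonly T[] _items;

    // Number of times an enumerator has been requested.
    public int Enumerations { get; private set; }

    public LegacyBag(params T[] items) { _items = items; }

    public int Count => _items.Length;
    public bool IsSynchronized => false;
    public object SyncRoot => this;
    public void CopyTo(Array array, int index) => _items.CopyTo(array, index);

    public IEnumerator<T> GetEnumerator()
    {
        Enumerations++;
        return ((IEnumerable<T>)_items).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class Program
{
    public static void Main()
    {
        var bag = new LegacyBag<string>("a", "b", "c");
        IEnumerable<string> items = bag;

        int n = items.Count();               // can be answered from ICollection.Count
        Console.WriteLine($"Count() = {n}, enumerations so far: {bag.Enumerations}");

        List<string> copy = items.ToList();  // no ICollection<T>, so it has to enumerate
        Console.WriteLine($"ToList().Count = {copy.Count}, enumerations so far: {bag.Enumerations}");
    }
}
```

With the Count and List<T> implementations quoted above, the first line should show zero enumerations and the second one. Other runtime versions may take different internal paths, so treat the counter values as an observation rather than a guarantee.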

Answer 3

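A very basic LinqPad test shows that calling IEnumerable<string>.Count() is faster than creating a list collection and getting its count, not to mention more memory-efficient (as noted in other answers) and faster when revisiting an already-enumerated collection.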

I was getting an average of about 4 ticks for calling Count() on the IEnumerable, versus about 10k ticks for creating a new list to get the count.

void Main()
{
    IEnumerable<string> ienumerable = GetStrings();
    var test1 = new Stopwatch();
    test1.Start();
    var count1 = ienumerable.Count();
    test1.Stop();
    test1.ElapsedTicks.Dump();

    var test2 = new Stopwatch();
    test2.Start();
    var count2 = ienumerable.ToList().Count;
    test2.Stop();
    test2.ElapsedTicks.Dump();

    var test3 = new Stopwatch();
    test3.Start();
    var count3 = ienumerable.Count();
    test3.Stop();
    test3.ElapsedTicks.Dump();
}

public IEnumerable<string> GetStrings()
{
    var testString = "test";
    var strings = new List<string>();
    for (int i = 0; i < 500000; i++)
    {
        strings.Add(testString);
    }

    return strings;
}

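In the latter case, you incur the cycles needed to create a new collection from the existing one (which, behind the scenes, has to copy every element), and only then pull the Count property from the new collection. As a result, the Enumerable optimization wins and returns the count faster.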

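In the third test run, the average dropped to ~2 ticks, because it immediately returned the count the collection already knows (via the checks highlighted below).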

ICollection<TSource> collectionoft = source as ICollection<TSource>;
if (collectionoft != null) return collectionoft.Count;
ICollection collection = source as ICollection;
if (collection != null) return collection.Count;

However, the real cost here is not CPU cycles but memory consumption. That is what you should be more concerned about.

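Finally, as a warning, be careful not to call Count() while enumerating the collection. Doing so re-enumerates the collection, which can cause conflicts. If you need the count for something while iterating over a collection, the correct way is to create a new list with .ToList() and iterate over that list, referencing its Count.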
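The warning above can be sketched as follows, assuming a lazy source. LoopCountDemo, Source and Passes are hypothetical names used only for this example. Passes counts how many times the sequence is started from the beginning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LoopCountDemo
{
    // Hypothetical counter: how many times the source was walked from the start.
    public static int Passes;

    public static IEnumerable<int> Source()
    {
        Passes++;
        for (int i = 0; i < 3; i++) yield return i;
    }

    // Calling Count() inside the loop restarts the lazy sequence on every iteration.
    public static int PassesWhenCountingInsideLoop()
    {
        Passes = 0;
        IEnumerable<int> items = Source();
        foreach (int item in items)
        {
            Console.WriteLine($"item {item}, total {items.Count()}");
        }
        return Passes;
    }

    // Materializing once with ToList() makes Count a stored property lookup.
    public static int PassesWithList()
    {
        Passes = 0;
        List<int> list = Source().ToList();
        foreach (int item in list)
        {
            Console.WriteLine($"item {item}, total {list.Count}");
        }
        return Passes;
    }

    public static void Main()
    {
        Console.WriteLine($"passes with Count() in loop: {PassesWhenCountingInsideLoop()}");
        Console.WriteLine($"passes with ToList() first: {PassesWithList()}");
    }
}
```

With three items, the first method walks the source four times (once for the foreach and once per Count() call), while the second walks it once. If the source were a database query or some other expensive or changing sequence, those repeated passes are where the cost and the possible inconsistencies come from.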

c#

linq

performance

ilist