C# Generic List<T> update items

Question

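I'm using a List<T> and I need to update properties of the objects in the list.

What would be the most efficient/fastest way to do this? I know that scanning through the indexes of a List<T> gets slower as the list grows, and that List<T> is not the most efficient collection for updates.

With that said, would it be better to:

Remove the matching object and then add the new one?
Scan through the list indexes until you find the matching object, and then update the object's properties?
If I have a collection, let's say an IEnumerable, and I want to apply that IEnumerable as updates to the list, what would be the best approach?

Stub code sample: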

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public class ProductRepository
{
    List<Product> product = Product.GetProduct();
    public void UpdateProducts(IEnumerable<Product> updatedProduct)
    {
    }
    public void UpdateProduct(Product updatedProduct)
    {
    }
}

Answer 1

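What exactly is efficiency?

Unless there are literally thousands of items, a foreach, a for, or any other type of looping operation will most likely only show differences in the milliseconds. Really? So you have wasted more time (at the cost of a programmer at $XX per hour, rather than a cost to the end user) trying to find the best.

So if you literally have thousands of records, I would recommend finding efficiency by processing the list in parallel with the Parallel.ForEach method, which can process enough records to make up for the overhead of threading.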
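As a rough illustration of the Parallel.ForEach suggestion, here is a minimal sketch. The helper name RenameCategory and the class ParallelUpdateSketch are hypothetical and only for illustration; Product mirrors the class from the question. It assumes each iteration only touches its own Product and that the list is not resized while the loop runs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public static class ParallelUpdateSketch
{
    // Hypothetical helper: moves every product in oldCategory to newCategory.
    // Each iteration writes to a different Product instance and the list
    // itself is only read, so this particular update needs no locking.
    public static void RenameCategory(List<Product> products, string oldCategory, string newCategory)
    {
        Parallel.ForEach(products, product =>
        {
            if (product.Category == oldCategory)
            {
                product.Category = newCategory;
            }
        });
    }
}
```

For small lists a plain foreach doing the same work is likely to be just as fast or faster, since the parallel version pays for scheduling work across threads.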

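IMHO, if the record count is greater than 100, it implies that a database is being used. If a database is involved, write an update stored procedure and call it a day; I would be hard pressed to write a one-off program to perform a specific update that could be done more easily in said database.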

Answer 2

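If you want fast lookups, you could consider using a Dictionary instead of a List. In your case the key would be the product ID (which I am assuming is unique). See Dictionary on MSDN.

For example: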

public class ProductRepository
{
    // Keyed by ProductId; GetProduct() is the stub loader from the question
    Dictionary<int, Product> products = Product.GetProduct();

    public void UpdateProducts(IEnumerable<Product> updatedProducts)
    {
        foreach (var productToUpdate in updatedProducts)
        {
            UpdateProduct(productToUpdate);
        }

        // any further update code here...
    }

    public void UpdateProduct(Product productToUpdate)
    {
        // Look up the existing product by its ID with a single hash lookup
        if (products.TryGetValue(productToUpdate.ProductId, out var product))
        {
            // update code here...
            product.ProductName = productToUpdate.ProductName;
        }
        else
        {
            // add the product, or throw an exception here if you prefer
            products.Add(productToUpdate.ProductId, productToUpdate);
        }
    }
}

Answer 3

Your use case is updating a List<T> that can contain millions of records, where the updated records can be a sub-list or just a single record.

The schema is as follows:

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

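Does Product contain a primary key, meaning that every Product object can be uniquely identified, there are no duplicates, and every update targets a single unique record?

If yes, then it is best to arrange the List<T> as a Dictionary<int,T>. Each update from the IEnumerable<T> is then O(1) time complexity, so all updates can be completed in time proportional to the size of the IEnumerable<T>, which I don't expect to be very big. There is extra memory allocation for the different data structure, but it would be a very fast solution. @JamieLupton has already provided a solution along similar lines.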
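To make the List<T> to Dictionary<int,T> idea concrete, here is a minimal sketch, assuming ProductId is unique. The class name ProductIndex and its method ApplyUpdates are hypothetical names chosen for illustration; Product mirrors the schema above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public class ProductIndex
{
    private readonly Dictionary<int, Product> _byId;

    // Building the index is a one-time O(N) pass. ToDictionary throws an
    // ArgumentException if two products share the same ProductId.
    public ProductIndex(IEnumerable<Product> products)
    {
        _byId = products.ToDictionary(p => p.ProductId);
    }

    // Applies M updates with one average-case O(1) lookup each.
    // Returns how many products were found and updated.
    public int ApplyUpdates(IEnumerable<Product> updates)
    {
        var updated = 0;
        foreach (var update in updates)
        {
            if (_byId.TryGetValue(update.ProductId, out var existing))
            {
                existing.ProductName = update.ProductName;
                existing.Category = update.Category;
                updated++;
            }
        }
        return updated;
    }
}
```

Because the dictionary stores references to the same Product objects that the list holds, property changes made through the index are visible through the original list as well.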

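If Product can be repeated and there is no primary key, the solution above is not valid. In that case the ideal way to scan through the List<T> is binary search, whose time complexity is O(logN). Note that binary search only works if the list is sorted by the key being searched.

Since the size of the IEnumerable<T> is comparatively small, say M, the overall time complexity is O(M*logN), where M is much smaller than N and can be neglected.

List<T> supports the BinarySearch API, which returns the index of the element. That index can then be used to update the object at the relevant position; check the example here.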
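The BinarySearch lookup described above could look roughly like this. It is a sketch under the assumption that the list is already sorted by ProductId; ProductIdComparer and TryReplace are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

// Orders products by ProductId only; use the same comparer for sorting and searching.
public class ProductIdComparer : IComparer<Product>
{
    public int Compare(Product x, Product y) => x.ProductId.CompareTo(y.ProductId);
}

public static class BinarySearchUpdateSketch
{
    // Assumes sortedProducts is already sorted with ProductIdComparer,
    // for example via sortedProducts.Sort(new ProductIdComparer()).
    // Returns true if a matching product was found and replaced.
    public static bool TryReplace(List<Product> sortedProducts, Product updated)
    {
        var index = sortedProducts.BinarySearch(updated, new ProductIdComparer());
        if (index < 0)
        {
            return false; // a negative result means no match was found
        }
        sortedProducts[index] = updated;
        return true;
    }
}
```

If several elements share the same key, BinarySearch returns the index of one of them, not necessarily the first, so duplicates would need an extra scan to the left and right of that index.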

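In my opinion, the best option for such a high number of records would be parallel processing combined with binary search.

Since thread safety is a concern, what I normally do is split the List<T> into a List<T>[], so that each unit can be assigned to a separate thread. A simple way to do this is the MoreLinq Batch API, where you can use Environment.ProcessorCount to get the number of system processors and then create an IEnumerable<IEnumerable<T>> as follows: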

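// Batch takes the size of each batch, not the number of batches; list is the List<T> to split
var enumerableList = list.Batch(Math.Max(1, list.Count / Environment.ProcessorCount)).ToList();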

Another way is to use custom code such as the following:

using System;
using System.Collections.Generic;

public static class MyExtensions
{
    // data          - the List<T> to split
    // dataCount     - data.Count, calculated once and passed in to avoid reading the property every time
    // partitionSize - size of each partition, which can be a function of the number of processors
    public static List<T>[] SplitList<T>(this List<T> data, int dataCount, int partitionSize)
    {
        int remainderData;
        var fullPartition = Math.DivRem(dataCount, partitionSize, out remainderData);

        // Fewer items than one full partition: return everything as a single partition
        if (fullPartition == 0)
            return new[] { data.GetRange(0, dataCount) };

        var listArray = new List<T>[fullPartition];
        var beginIndex = 0;

        for (var partitionCounter = 0; partitionCounter < fullPartition; partitionCounter++)
        {
            // The last partition also absorbs the remainder
            if (partitionCounter == fullPartition - 1)
                listArray[partitionCounter] = data.GetRange(beginIndex, partitionSize + remainderData);
            else
                listArray[partitionCounter] = data.GetRange(beginIndex, partitionSize);
            beginIndex += partitionSize;
        }
        return listArray;
    }
}

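Now you can create a Task[], where each Task is assigned to one List<T> element of the List<T>[] generated above, and then run a binary search on each sub-partition. Although this is repetitive, it takes advantage of both parallel processing and binary search. Each Task can be started, and then we can wait for the tasks to finish processing using Task.WaitAll(taskArray).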
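The Task[] plus binary search combination might be sketched as follows. This assumes every partition is sorted by ProductId and that no two tasks share a partition; PartitionedUpdateSketch, ApplyUpdates, and ProductIdComparer are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public class ProductIdComparer : IComparer<Product>
{
    public int Compare(Product x, Product y) => x.ProductId.CompareTo(y.ProductId);
}

public static class PartitionedUpdateSketch
{
    // One task per partition. Each task binary-searches its own partition for
    // every update and rewrites the matching product's properties in place.
    public static void ApplyUpdates(List<Product>[] partitions, IReadOnlyList<Product> updates)
    {
        var comparer = new ProductIdComparer();
        var taskArray = new Task[partitions.Length];

        for (var i = 0; i < partitions.Length; i++)
        {
            var partition = partitions[i]; // local copy so the lambda captures this partition
            taskArray[i] = Task.Run(() =>
            {
                foreach (var update in updates)
                {
                    var index = partition.BinarySearch(update, comparer);
                    if (index >= 0)
                    {
                        partition[index].ProductName = update.ProductName;
                        partition[index].Category = update.Category;
                    }
                }
            });
        }

        // Block until every partition has been processed
        Task.WaitAll(taskArray);
    }
}
```

Since GetRange makes a shallow copy, partitions produced that way hold references to the same Product objects as the source list, so property updates made here also show up in the original list.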

Beyond that, if you create a Dictionary<int,T>[] and process it in parallel, that would be the fastest.

The final merge of the List<T>[] back into a List<T> can be done using Linq Aggregate or SelectMany, as follows:

// data is the original List<T>; partitionSize is the chosen partition size
List<T>[] splitListArray = data.SplitList(data.Count, partitionSize);

// Process splitListArray

var finalList = splitListArray.SelectMany(obj => obj).ToList();

Another option would be to use Parallel.ForEach along with a thread-safe data structure such as ConcurrentBag<T>, or maybe ConcurrentDictionary<int,T> if you are replacing the complete object; if it is only a property update, a simple List<T> would work. Parallel.ForEach internally uses a range partitioner similar to what I have suggested above.
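For the case where whole objects are replaced, a minimal sketch with ConcurrentDictionary<int,T> could look like this. ConcurrentReplaceSketch and Upsert are hypothetical names for illustration, and the sketch assumes the store is keyed by ProductId.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string Category { get; set; }
}

public static class ConcurrentReplaceSketch
{
    // Replaces (or adds) whole Product objects keyed by ProductId.
    public static void Upsert(ConcurrentDictionary<int, Product> store, IEnumerable<Product> updates)
    {
        Parallel.ForEach(updates, update =>
        {
            // The indexer setter is thread-safe: it adds the key
            // or overwrites the value that is already stored.
            store[update.ProductId] = update;
        });
    }
}
```

If the same ProductId appears more than once in the updates, which one ends up stored depends on thread scheduling, so this sketch fits best when each key is updated at most once per batch.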

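Which of the solutions above is ideal depends on your use case, and you should be able to combine them to get the best result. Let me know if you need a specific example.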

c#

generic-list