Adding clones to a list is slow
I am trying to speed up a loop that clones 2,500,000 objects.
The cloning itself takes 800 ms over the whole loop, but once I add the clones to a list, the loop takes 3 seconds.
List<T> list = new List<T>();
Stopwatch sw = new Stopwatch();
sw.Start();
foreach (T entity in listSource)
{
    T entityCloned = GetEntityClone(entity); // takes 800 ms in total
    if (entityCloned != null)
        list.Add(entityCloned);
}
sw.Stop();
Can you help me find out why those Add calls take so much time?
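Unfortunately, iterating over that many items and deep-copying objects takes time. I don't think 3 seconds is necessarily unreasonable.
You can still make it faster, though.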
First, if you know in advance how many items the resulting list needs to hold, you can set its internal capacity beforehand so the list never has to resize. Resizing is expensive and can be avoided. You can do this either by setting the list's Capacity property or by passing the capacity as a constructor argument.
Once the capacity has been allocated, each Add should be O(1), with no reallocations (each of which is an O(n) operation). In that case, adding to the list is unlikely to be the bottleneck.
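As a minimal sketch of the capacity suggestion above: the `CloneAll` helper and the `int[]` test data are hypothetical stand-ins for the question's `GetEntityClone` and entity type, which are not shown in the post. The source count is used as an upper bound for the capacity, since null clones are skipped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original post): clones every item of
// `source` into a list whose capacity is reserved up front.
static List<T> CloneAll<T>(IReadOnlyCollection<T> source, Func<T, T> clone) where T : class
{
    // source.Count is an upper bound; skipped nulls only leave spare slots.
    var result = new List<T>(source.Count);
    foreach (T item in source)
    {
        T cloned = clone(item);
        if (cloned != null)
            result.Add(cloned); // never triggers a resize
    }
    return result;
}

// Stand-in data: int[] plays the role of the entity type.
var source = new List<int[]> { new[] { 1 }, null, new[] { 2, 3 } };
List<int[]> clones = CloneAll(source, a => a == null ? null : (int[])a.Clone());
Console.WriteLine($"{clones.Count} clones, capacity {clones.Capacity}");
```

If many source items are null, the list keeps some unused slots; calling `TrimExcess()` afterwards releases them at the cost of one more copy.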
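You can also remove the nulls from the source list before copying, which eliminates the if statement that otherwise has to be evaluated on every iteration. Using LINQ:
var noNulls = listSource.Where(o => o != null);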
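A short sketch of how the filtered sequence could feed the cloning loop. The `int[]` data and the `Clone()` call are stand-ins for the question's entity type and `GetEntityClone`, and this assumes the clone is null only when the source item is null. Note that `Where` is lazy and adds a delegate call per element, so whether it beats the plain if check is something to measure rather than assume.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in source data with some null entries.
var listSource = new List<int[]> { new[] { 1 }, null, new[] { 2 }, null, new[] { 3 } };

// Where is lazy: filtering happens while the loop below enumerates it.
IEnumerable<int[]> noNulls = listSource.Where(o => o != null);

// The unfiltered count is an upper bound for the result size.
var list = new List<int[]>(listSource.Count);
foreach (int[] entity in noNulls)
    list.Add((int[])entity.Clone()); // no null check needed inside the loop

Console.WriteLine(list.Count);
```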
I saved some time (about 33%) by using an array instead of a list.
MyObject class definition:
public class MyObject
{
    public int Id { get; set; }
    public bool Flag { get; set; }

    public static MyObject GetEntityClone(MyObject obj)
    {
        if (obj == null) return null;
        var newObj = new MyObject()
        {
            Id = obj.Id,
            Flag = obj.Flag
        };
        return newObj;
    }
}
Code:
var sourceList = new List<MyObject>();
// Mock the source data; every 27th element is null.
for (int i = 0; i < 2500000; ++i)
{
    if (i % 27 != 0)
        sourceList.Add(new MyObject { Id = i, Flag = (i % 2 == 0) });
    else
        sourceList.Add(null);
}
// Size the buffer for the worst case (no nulls in the source).
var destArray = new MyObject[sourceList.Count];
Stopwatch sw = new Stopwatch();
sw.Start();
Console.WriteLine(sw.ElapsedMilliseconds);
var currentElement = 0;
for (int i = 0; i < sourceList.Count; ++i)
{
    MyObject entityCloned = MyObject.GetEntityClone(sourceList[i]);
    if (entityCloned != null)
        destArray[currentElement++] = entityCloned;
}
// Trim the buffer to the number of clones actually stored.
var result = new MyObject[currentElement];
Array.Copy(destArray, 0, result, 0, currentElement);
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Try doing it in parallel:
ConcurrentBag<T> list = new ConcurrentBag<T>();
Parallel.ForEach(listSource, entity =>
{
    T entityCloned = GetEntityClone(entity); // takes 800 ms in total
    if (entityCloned != null)
        list.Add(entityCloned);
});
// ConcurrentBag is unordered, so listVersion may not follow the source order.
var listVersion = list.ToList();
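If the clones must stay in source order, one possible alternative (a different technique from the `ConcurrentBag` answer above, not something the original posters proposed) is PLINQ with `AsOrdered()`. This is only a sketch: the `int[]` data and `Clone()` call are stand-ins for the question's entities and `GetEntityClone`, and for clones as cheap as these the parallel overhead may well outweigh the gain, so it needs measuring.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in source data: every 27th element is null.
var listSource = Enumerable.Range(0, 1000)
    .Select(i => i % 27 == 0 ? null : new[] { i })
    .ToList();

List<int[]> clones = listSource
    .AsParallel()
    .AsOrdered()                                        // keep source order in the output
    .Select(e => e == null ? null : (int[])e.Clone())   // clone in parallel
    .Where(c => c != null)                              // drop null clones
    .ToList();

Console.WriteLine(clones.Count);
```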