ArrayList capacity size increasing strange behaviour
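When an ArrayList has to store more elements than its current capacity allows, it increases the capacity. This is a very expensive operation, because all the data is copied from the previous backing array to a new, larger one. But I wonder whether some capacity-related work might happen not only at the moment the ArrayList needs more space, but earlier. I want to know why the "slow index" lines in my output take so long, and increasing the capacity is my only idea. Here is my code: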
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class MainArr {
    ArrayList<Integer> normalList = new ArrayList<Integer>();

    public static void main(String[] args) throws Exception {
        MainArr m = new MainArr();
        m.addElements();
    }

    public void addElements() throws Exception {
        long startTime = System.currentTimeMillis();
        for (int j = 0; j < 20000000; j++) {
            // Report the backing array length every 500000 elements.
            if (j % 500000 == 0) {
                System.out.println("j:" + j + " capacity:" + getCapacity(this.normalList));
            }
            // Time a single add() and report it if it took more than 50 ms.
            long addTime = System.currentTimeMillis();
            this.normalList.add(j);
            if (System.currentTimeMillis() - addTime > 50) {
                System.out.println("slow index-" + j + " - time:" + (System.currentTimeMillis() - addTime));
            }
        }
        System.out.println("End after:" + (System.currentTimeMillis() - startTime));
    }

    // Reads the length of the private backing array via reflection.
    // On JDK 16+ this needs --add-opens java.base/java.util=ALL-UNNAMED.
    int getCapacity(List<?> al) throws Exception {
        Field field = ArrayList.class.getDeclaredField("elementData");
        field.setAccessible(true);
        return ((Object[]) field.get(al)).length;
    }
}
Output:
j:0 capacity:0
j:500000 capacity:540217
j:1000000 capacity:1215487
j:1500000 capacity:1823230
j:2000000 capacity:2734845
j:2500000 capacity:2734845
j:3000000 capacity:4102267
j:3500000 capacity:4102267
j:4000000 capacity:4102267
slow index-4102267 - time:1203 //We need more space in ArrayList.That's why it takes some time.
j:4500000 capacity:6153400
j:5000000 capacity:6153400
j:5500000 capacity:6153400
j:6000000 capacity:6153400
j:6500000 capacity:9230100
slow index-6758010 - time:1477 //We dont need to increase capacity. But we stop for a moment...
j:7000000 capacity:9230100 //... and we have the same capacity
j:7500000 capacity:9230100
j:8000000 capacity:9230100
j:8500000 capacity:9230100
j:9000000 capacity:9230100
j:9500000 capacity:13845150 // Somehow capacity is increased insanely fast
j:10000000 capacity:13845150
j:10500000 capacity:13845150
j:11000000 capacity:13845150
j:11500000 capacity:13845150
j:12000000 capacity:13845150
slow index-12426474 - time:3168 //We dont need to increase capacity. But we stop for a moment...
j:12500000 capacity:13845150 //... and we have the same capacity
j:13000000 capacity:13845150
j:13500000 capacity:13845150
j:14000000 capacity:20767725 // Somehow capacity is increased insanely fast
j:14500000 capacity:20767725
slow index-14639924 - time:144
j:15000000 capacity:20767725
j:15500000 capacity:20767725
j:16000000 capacity:20767725
j:16500000 capacity:20767725
j:17000000 capacity:20767725
j:17500000 capacity:20767725
j:18000000 capacity:20767725
j:18500000 capacity:20767725
j:19000000 capacity:20767725
j:19500000 capacity:20767725
slow index-19980735 - time:218
End after:6990
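Every time the add function is called, it calls the ensureCapacity function with size+1 as the minCapacity argument (the size of the list, not of the array behind the list).
You can see the code of ensureCapacity below: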
public void ensureCapacity(int minCapacity) {
    modCount++;
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}
Note that it creates a new array only when the minCapacity argument is greater than the current length of the array.
Edit (thanks @Jyotsana Nandwani):
In JDK 1.7, the new way to calculate the resize is:
int newCapacity = oldCapacity + (oldCapacity >> 1);
where the right shift operator adds 50% of the old capacity, i.e. the new capacity is 1.5 times the old one.
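To compare the two growth formulas quoted above, here is a minimal sketch that applies each one repeatedly starting from a capacity of 10. The class name GrowthSequence and the helper sequence are made up for this illustration and are not part of the JDK; only the two arithmetic expressions come from the quoted source.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthSequence {
    // Rule quoted for JDK 1.7: old capacity plus half of it (rounded down).
    static int grow(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    // Older rule quoted from ensureCapacity.
    static int growLegacy(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // Hypothetical helper: the first `steps` capacities, starting at `start`.
    static List<Integer> sequence(int start, int steps, boolean legacy) {
        List<Integer> caps = new ArrayList<Integer>(steps);
        int cap = start;
        for (int i = 0; i < steps; i++) {
            caps.add(cap);
            cap = legacy ? growLegacy(cap) : grow(cap);
        }
        return caps;
    }

    public static void main(String[] args) {
        // prints "shift rule:  [10, 15, 22, 33, 49, 73, 109]"
        System.out.println("shift rule:  " + sequence(10, 7, false));
        // prints "legacy rule: [10, 16, 25, 38, 58, 88, 133]"
        System.out.println("legacy rule: " + sequence(10, 7, true));
    }
}
```

Both rules grow the array by roughly a factor of 1.5; they differ only in rounding and in the extra +1 of the older version.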
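The ArrayList code is optimized: the capacity starts at 10 and grows by a factor of 1.5 each time more space is needed.
You can detect the exact growth points with a modified version of your program: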
public void addElements() throws Exception {
    int lastCap = -1;
    for (int j = 0; j < 1000000; j++) {
        this.normalList.add(j);
        int cap = getCapacity(this.normalList);
        // Print only when the backing array length has changed.
        if (cap != lastCap) {
            System.out.println("size:" + normalList.size() + " capacity:" + cap);
            lastCap = cap;
        }
    }
}

int getCapacity(List al) throws Exception {
    Field field = ArrayList.class.getDeclaredField("elementData");
    field.setAccessible(true);
    return ((Object[]) field.get(al)).length;
}
This prints the following numbers:
size:1 capacity:10
size:11 capacity:15
size:16 capacity:22
size:23 capacity:33
size:34 capacity:49
size:50 capacity:73
size:74 capacity:109
... // And so on
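The source code responsible for growing the list is in the ensureCapacity method, as follows:
int newCapacity = (oldCapacity * 3)/2 + 1;
This is equivalent to multiplying by 1.5 in integer arithmetic. Note that this line comes from older JDK versions: the numbers printed above (10 growing to 15 rather than 16) match the JDK 1.7 formula oldCapacity + (oldCapacity >> 1).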
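To improve performance, try specifying a large initial capacity when you instantiate the ArrayList. For example:
List<User> users = new ArrayList<>(100000);
users.add(new User("John", "Doe"));
This helps only if you know the required capacity in advance. If you don't know how many instances there will be, consider using a different data structure.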
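To get a rough feel for what presizing saves, the following sketch counts how many times a backing array would be reallocated, and how many elements would be copied in total, for a given number of appends. It simulates the growth rule on plain integers and does not touch a real ArrayList; the class ResizeCost, the method simulate, and the assumption that growth follows oldCapacity + (oldCapacity >> 1) are all choices made for this illustration.

```java
class ResizeCost {
    // Returns {number of resizes, total elements copied} for n appends into a
    // simulated backing array that starts with the given capacity.
    // Assumption: growth follows old + (old >> 1), as in the JDK 1.7 formula.
    static long[] simulate(int n, int initialCapacity) {
        long resizes = 0;
        long copied = 0;
        int capacity = initialCapacity;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {       // array is full before this append
                copied += size;           // every existing element is copied over
                capacity = Math.max(capacity + (capacity >> 1), size + 1);
                resizes++;
            }
        }
        return new long[] {resizes, copied};
    }

    public static void main(String[] args) {
        int n = 20000000;
        long[] byDefault = simulate(n, 10);
        long[] presized = simulate(n, n);
        System.out.println("default capacity: resizes=" + byDefault[0] + " copied=" + byDefault[1]);
        System.out.println("presized:         resizes=" + presized[0] + " copied=" + presized[1]);
    }
}
```

With a capacity equal to the final size, the simulated array never has to grow, so both counters stay at zero. This only models copying; it says nothing about other costs of a real run.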
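For example, look at the implementations of the Queue interface, in particular LinkedList. This data structure adds a new element in constant time, but it is a poor choice for getting elements by index when the element sits in the middle of the list. Note that LinkedList also implements the List interface, just like ArrayList, so the following syntax is valid:
List<User> users = new LinkedList<>();
users.add(new User("John", "Doe"));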
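Because both classes implement List, code written against the interface works with either one. The sketch below fills an ArrayList and a LinkedList through the same method; the class ListChoice and the helper fill are hypothetical names used only for this example, and Integer stands in for the User class, which is not shown in the thread.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListChoice {
    // Appends 0..n-1 to any List implementation and returns the same list.
    static List<Integer> fill(List<Integer> target, int n) {
        for (int i = 0; i < n; i++) {
            target.add(i);
        }
        return target;
    }

    public static void main(String[] args) {
        List<Integer> array = fill(new ArrayList<Integer>(1000), 1000);
        List<Integer> linked = fill(new LinkedList<Integer>(), 1000);

        // Lists compare equal when they hold the same elements in the same order.
        System.out.println(array.equals(linked)); // prints "true"

        // get(index) works on both, but LinkedList has to walk its nodes to reach it.
        System.out.println(array.get(500) + " " + linked.get(500)); // prints "500 500"
    }
}
```

Which implementation is faster for a given workload is best measured rather than assumed.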