Strange behaviour of ArrayList capacity growth

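When an ArrayList has to store more elements than its current capacity allows, it increases its capacity. This is a fairly expensive operation, because all the data is copied from the old backing array into a new, larger one. But I wonder whether some capacity-related operation might also happen earlier, not only at the moment the ArrayList actually needs more space. I would like to know why the "slow index" entries in my output take so long, and a capacity increase is the only explanation I can think of. Here is my code: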

import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class MainArr {
    ArrayList<Integer> normalList = new ArrayList<Integer>();

    public static void main(String[] args) throws Exception {
        MainArr m = new MainArr();
        m.addElements();
    }

    public void addElements() throws Exception {
        long startTime = System.currentTimeMillis();
        for (int j = 0; j < 20000000; j++) {
            if (j % 500000 == 0) {
                System.out.println("j:" + j + " capacity:" + getCapacity(this.normalList));
            }
            long addTime = System.currentTimeMillis();
            this.normalList.add(j);
            if (System.currentTimeMillis() - addTime > 50) {
                System.out.println("slow index-" + j + " - time:" + (System.currentTimeMillis() - addTime));
            }
        }
        System.out.println("End after:" + (System.currentTimeMillis() - startTime));
    }

    int getCapacity(List al) throws Exception {
        Field field = ArrayList.class.getDeclaredField("elementData");
        field.setAccessible(true);
        return ((Object[]) field.get(al)).length;
    }
}

Output:

j:0 capacity:0
j:500000 capacity:540217
j:1000000 capacity:1215487
j:1500000 capacity:1823230
j:2000000 capacity:2734845
j:2500000 capacity:2734845
j:3000000 capacity:4102267
j:3500000 capacity:4102267
j:4000000 capacity:4102267
slow index-4102267 - time:1203 // We need more space in the ArrayList. That's why it takes some time.
j:4500000 capacity:6153400
j:5000000 capacity:6153400
j:5500000 capacity:6153400
j:6000000 capacity:6153400
j:6500000 capacity:9230100
slow index-6758010 - time:1477 // We don't need to increase capacity. But we stop for a moment...
j:7000000 capacity:9230100 //... and we have the same capacity
j:7500000 capacity:9230100
j:8000000 capacity:9230100
j:8500000 capacity:9230100
j:9000000 capacity:9230100
j:9500000 capacity:13845150 // Somehow capacity is increased insanely fast
j:10000000 capacity:13845150
j:10500000 capacity:13845150
j:11000000 capacity:13845150
j:11500000 capacity:13845150
j:12000000 capacity:13845150
slow index-12426474 - time:3168 // We don't need to increase capacity. But we stop for a moment...
j:12500000 capacity:13845150 //... and we have the same capacity
j:13000000 capacity:13845150
j:13500000 capacity:13845150
j:14000000 capacity:20767725  // Somehow capacity is increased insanely fast
j:14500000 capacity:20767725
slow index-14639924 - time:144  
j:15000000 capacity:20767725
j:15500000 capacity:20767725
j:16000000 capacity:20767725
j:16500000 capacity:20767725
j:17000000 capacity:20767725
j:17500000 capacity:20767725
j:18000000 capacity:20767725
j:18500000 capacity:20767725
j:19000000 capacity:20767725
j:19500000 capacity:20767725
slow index-19980735 - time:218
End after:6990

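Every call to add invokes ensureCapacity with size + 1 as the minCapacity argument (size here is the size of the list, not of the array backing it).

Here is the code of ensureCapacity (as it appears in JDK versions before 1.7):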

 public void ensureCapacity(int minCapacity) {
         modCount++;
         int oldCapacity = elementData.length;
         if (minCapacity > oldCapacity) {
             Object oldData[] = elementData;
             int newCapacity = (oldCapacity * 3)/2 + 1;
             if (newCapacity < minCapacity)
                 newCapacity = minCapacity;
             // minCapacity is usually close to size, so this is a win:
             elementData = Arrays.copyOf(elementData, newCapacity);
         }
     }

Note that a new array is created only when the minCapacity argument is greater than the current length of the array.

Edit (thanks @Jyotsana Nandwani):

In JDK 1.7, the new capacity is calculated as:

int newCapacity = oldCapacity + (oldCapacity >> 1);

The right shift halves oldCapacity (rounding down), so the capacity grows by 50% of the old capacity, i.e. to roughly 1.5 times its previous value.
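To make the difference between the two formulas quoted above concrete, here is a minimal sketch that applies each rule repeatedly, starting from a capacity of 10. The class and method names (GrowthFormulas, growOld, growNew, sequence) are illustrative only and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthFormulas {
    // Growth rule from the older ensureCapacity code quoted above.
    static int growOld(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // Growth rule quoted above for JDK 1.7.
    static int growNew(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    // Applies one of the rules `steps` times, starting from `start`,
    // and returns every capacity along the way (including the start).
    static List<Integer> sequence(int start, int steps, boolean useNewRule) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = start;
        capacities.add(capacity);
        for (int i = 0; i < steps; i++) {
            capacity = useNewRule ? growNew(capacity) : growOld(capacity);
            capacities.add(capacity);
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println("old rule: " + sequence(10, 6, false)); // [10, 16, 25, 38, 58, 88, 133]
        System.out.println("new rule: " + sequence(10, 6, true));  // [10, 15, 22, 33, 49, 73, 109]
    }
}
```

Both rules grow the array by about half each time; they differ only in the extra + 1 and in rounding, which is why the two sequences drift apart after the first step.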

The ArrayList code is optimized so that the capacity starts at 10 and grows by a factor of 1.5 each time more space is needed.

You can detect the exact growth points with a modified version of your program:

public void addElements() throws Exception {
    int lastCap = -1;
    for (int j = 0; j < 1000000; j++) {
        this.normalList.add(j);
        int cap = getCapacity(this.normalList);
        if (cap != lastCap) {
            System.out.println("size:" + normalList.size() + " capacity:" + cap);
            lastCap = cap;
        }
    }
}

int getCapacity(List al) throws Exception {
    Field field = ArrayList.class.getDeclaredField("elementData");
    field.setAccessible(true);
    return ((Object[]) field.get(al)).length;
}

It prints the following numbers:

size:1 capacity:10
size:11 capacity:15
size:16 capacity:22
size:23 capacity:33
size:34 capacity:49
size:50 capacity:73
size:74 capacity:109
... // And so on
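If reflective access to elementData is not available (recent JDKs may reject it unless java.base is opened to the program), the growth points can be predicted instead of measured. The following is a minimal sketch under two assumptions: the first allocation has capacity 10, and every later growth follows oldCapacity + (oldCapacity >> 1). The names GrowthPoints and capacitiesFor are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthPoints {
    // Returns the capacities the backing array would pass through while
    // `count` elements are added one at a time.
    // Assumptions: first allocation is 10, then oldCapacity + (oldCapacity >> 1).
    static List<Integer> capacitiesFor(int count) {
        List<Integer> capacities = new ArrayList<>();
        if (count <= 0) {
            return capacities;          // nothing added, nothing allocated
        }
        int capacity = 10;
        capacities.add(capacity);
        // A growth is needed whenever more elements remain than the array can hold.
        while (capacity < count) {
            capacity = capacity + (capacity >> 1);
            capacities.add(capacity);
        }
        return capacities;
    }

    public static void main(String[] args) {
        int previous = 0;
        for (int capacity : capacitiesFor(20_000_000)) {
            // The add at index `previous` finds the array full and triggers the growth.
            System.out.println("add at index " + previous + " grows capacity to " + capacity);
            previous = capacity;
        }
    }
}
```

Under these assumptions the simulated capacities include 4102267, 6153400, 9230100, 13845150 and 20767725, the same values that appear in the question's output. Of the slow indexes reported there, only 4102267 coincides with a predicted growth point.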

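In older JDKs, the source code responsible for growing the list is in the ensureCapacity method:

int newCapacity = (oldCapacity * 3)/2 + 1;

This amounts to integer multiplication by 1.5, plus one. Note that the capacities printed above (10, 15, 22, 33, ...) match the rule used from JDK 1.7 on, oldCapacity + (oldCapacity >> 1); the older formula would give 10, 16, 25, 38, ... instead.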

To improve performance, try specifying a large initial capacity when you instantiate the ArrayList. For example:

List<User> users = new ArrayList<>(100000);
users.add(new User("John", "Doe"));

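This only helps if you know the required capacity in advance. If you do not know how many instances there will be, consider using a different data structure.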
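When the required size only becomes known after the list has been created, the public ArrayList.ensureCapacity(int) method can reserve the space in one step before a bulk insert. A minimal sketch; the class and method names (PreSizeExample, fillPreSized) are illustrative.

```java
import java.util.ArrayList;

class PreSizeExample {
    // Fills a list with `count` integers after reserving room for all of them.
    static ArrayList<Integer> fillPreSized(int count) {
        // The variable is typed as ArrayList because ensureCapacity
        // is not declared on the List interface.
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(count);     // ask for the whole backing array up front
        for (int i = 0; i < count; i++) {
            list.add(i);                // no further growth is needed within `count` adds
        }
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = fillPreSized(1_000_000);
        System.out.println("size: " + list.size());
    }
}
```

Reserving capacity does not change size(): the list still reports only the elements that were actually added.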

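For example, look at the implementations of the Queue interface, in particular LinkedList. This data structure adds a new element in constant time, but it is poorly suited to retrieving an element by index when that element sits in the middle of the list. Note that LinkedList, like ArrayList, also implements the List interface, so the following syntax is valid: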

List<User> users = new LinkedList<>();
users.add(new User("John", "Doe"));