Why are C++ STL vectors 1000x slower when doing many reserves?

Question

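I'm running into a strange situation.

In my program I have a loop that combines a bunch of data into one giant vector. I was trying to figure out why it was running so slowly, even though it seemed like I was doing everything right to allocate memory efficiently as I went.

In my program it is difficult to determine how big the final vector of combined data should be, but the size of each piece of data is known at the time it is processed. So instead of reserving and resizing the combined data vector in one go, I was reserving enough space for each data chunk as it was added to the larger vector. That's when I ran into this issue, which can be reproduced with the simple snippet below: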

std::vector<float> arr1;
std::vector<float> arr2;
std::vector<float> arr3;
std::vector<float> arr4;
int numLoops = 10000;
int numSubloops = 50;

{
    // Test 1
    // Naive test where no pre-allocation occurs

    for (int q = 0; q < numLoops; q++)
    {
        for (int g = 0; g < numSubloops; g++)
        {
            arr1.push_back(q * g);
        }
    }
}

{
    // Test 2
    // Ideal situation where total amount of data is reserved beforehand

    arr2.reserve(numLoops * numSubloops);
    for (int q = 0; q < numLoops; q++)
    {
        for (int g = 0; g < numSubloops; g++)
        {
            arr2.push_back(q * g);
        }
    }
}

{
    // Test 3
    // Total data is not known beforehand, so allocations made for each
    // data chunk as they are processed using 'resize' method

    int arrInx = 0;
    for (int q = 0; q < numLoops; q++)
    {
        arr3.resize(arr3.size() + numSubloops);
        for (int g = 0; g < numSubloops; g++)
        {
            arr3[arrInx++] = q * g;
        }
    }
}

{
    // Test 4
    // Total data is not known beforehand, so allocations are made for each
    // data chunk as they are processed using the 'reserve' method

    for (int q = 0; q < numLoops; q++)
    {
        arr4.reserve(arr4.size() + numSubloops);
        for (int g = 0; g < numSubloops; g++)
        {
            arr4.push_back(q * g);
        }
    }
}

The results of this test, compiled in Visual Studio 2017, are as follows:

Test 1: 7 ms
Test 2: 3 ms
Test 3: 4 ms
Test 4: 4000 ms
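
The snippet above does not show how the timings were taken. As a minimal sketch of one way to measure each test block, assuming std::chrono::steady_clock is an acceptable clock (the helper name elapsedMs is hypothetical and is not the harness that produced the numbers above):

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `body` once and returns the elapsed wall-clock
// time in whole milliseconds, measured with a monotonic clock.
template <typename F>
long long elapsedMs(F&& body)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(body)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Each test block can then be passed in as a lambda, for example elapsedMs([&] { /* body of Test 4 */ }). Results will vary considerably between debug and optimized builds.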

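Why the huge discrepancy in running times?

Why does calling reserve a bunch of times, followed by push_back, take 1000x longer than calling resize a bunch of times, followed by direct index access?

How does it make any sense that it could take 500x longer than the naive approach which includes no pre-allocations at all?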

Answer 1

How does it make any sense that it could take 500x longer than the naive approach which includes no pre-allocations at all?

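That's where you're mistaken. The 'naive' approach you speak of does do pre-allocations. They are just done behind the scenes, and infrequently, inside the calls to push_back. It does not allocate room for just one more element every time you call push_back. It allocates an amount that is a multiple of the current capacity (usually between 1.5x and 2x). Then it does not need to allocate again until that capacity runs out. This is much more efficient than your loop, which performs an allocation every time 50 elements are added, with no regard for the current capacity.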
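
To see how rarely plain push_back reallocates, one can count the capacity changes directly. This is a minimal sketch; the function name countReallocationsPushBack is hypothetical, and the exact count depends on the growth factor of the standard library in use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n elements with push_back (no explicit
// reserve) and counts how many times the vector's capacity changes.
// Each capacity change corresponds to one reallocation.
std::size_t countReallocationsPushBack(std::size_t n)
{
    std::vector<float> v;
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(static_cast<float>(i));
        if (v.capacity() != lastCapacity)
        {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

For the 500,000 elements in the question, a growth factor between 1.5x and 2x should give only a few dozen reallocations, compared with one per chunk (10,000 in total) in Test 4.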

Answer 2

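@Benjamin Lindley's answer explains how the capacity of std::vector behaves. As for exactly why the 4th test case is so slow, that comes down to an implementation detail of the standard library.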

[vector.capacity]

void reserve(size_type n);

...

Effects: A directive that informs a vector of a planned change in size, so that it can manage the storage allocation accordingly. After reserve(), capacity() is greater or equal to the argument of reserve if reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens at this point if and only if the current capacity is less than the argument of reserve().

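So the C++ standard does not guarantee any extra headroom after reserve(): it only requires the capacity to be at least the requested amount, and an implementation may allocate exactly what was asked for. Personally, I think it would not be unreasonable for an implementation to follow some growth policy when it receives a request for a larger capacity. However, I also tested this on my machine, and it seems the STL does the simplest thing.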
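
If the total size really is unknown up front, one possible workaround is to keep reserving per chunk but never ask for less than double the current capacity. The sketch below compares that against Test 4's pattern by counting reallocations. The names reserveGeometric and countChunkReallocations are hypothetical, and the factor of two is an assumption chosen for illustration, not something the standard prescribes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: makes room for `extra` more elements, but when it has
// to grow, it requests at least twice the current capacity.
void reserveGeometric(std::vector<float>& v, std::size_t extra)
{
    const std::size_t needed = v.size() + extra;
    if (needed > v.capacity())
    {
        v.reserve(std::max(needed, 2 * v.capacity()));
    }
}

// Appends numChunks chunks of chunkSize elements and counts capacity changes.
// geometric == false reproduces Test 4: reserve(size() + chunkSize) per chunk.
std::size_t countChunkReallocations(std::size_t numChunks,
                                    std::size_t chunkSize,
                                    bool geometric)
{
    std::vector<float> v;
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t q = 0; q < numChunks; ++q)
    {
        if (geometric)
            reserveGeometric(v, chunkSize);
        else
            v.reserve(v.size() + chunkSize);

        for (std::size_t g = 0; g < chunkSize; ++g)
            v.push_back(static_cast<float>(q * g));

        // The reserve above guarantees room for the whole chunk, so capacity
        // can change at most once per chunk.
        if (v.capacity() != lastCapacity)
        {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

On an implementation whose reserve() allocates exactly the requested amount, the per-chunk pattern reallocates on essentially every chunk, and each reallocation copies every element already stored, so the total work grows quadratically with the number of chunks. With the doubling helper, the question's sizes (10,000 chunks of 50) need roughly 15 reallocations under the same assumption.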
