为什么 std::unordered_map 有保留方法？

Question

根据 this 你不能为 std::map 保留 space:

No, the members of the map are internally stored in a tree structure. There is no way to build the tree until you know the keys and values that are to be stored.

从这里可以明显看出为什么 std::map 缺少 reserve() 方法，而 cppreference.com 却没有。但是，std::unordered_map 是否有 reserve() 方法，但是当我尝试将它与 operator[]、insert() 或 [=18 一起使用时=] 尽管我先调用了 reserve()，但它们都去分配内存。

这是怎么回事？为什么 reserve() 不能正确保留所需的 space？如果 map 不能预先分配内存，那么为什么 std::unordered_map 甚至首先有一个 reserve() 方法？

Answer 1

unordered_map 容器有一个 reserve 方法，因为它是使用桶实现的，而不是 map 中的树。

一个桶是：

a slot in the container's internal hash table to which elements are assigned based on the hash value of their key. Buckets are numbered from 0 to (bucket_count-1). (source)

单个桶可容纳可变数量的项目。这个数字是基于 load_factor。当 load_factor 达到某个阈值时，容器会增加桶的数量并重新散列映射。

当您调用 reserve(n) 时，容器会创建足够的桶来容纳至少 n 项。

这与 rehash(n) 形成对比，后者直接将桶数设置为 n 并触发整个哈希的重建 table.

另请参阅：Pre-allocating buckets in a C++ unordered_map

根据评论进行编辑

由于我不知道评论中提出的问题的确切答案，而且我的初步研究也没有取得成果，所以我决定进行实验测试。

供参考，问题归结为：

Could you please explain if reserving buckets for n elements is the same as allocating memory for n elements?

根据 this answer，准确检索 unordered_map 中分配的 space 的大小是棘手且不可靠的。所以我决定使用 Visual Studio 2015 年的诊断工具。

首先，我的测试用例如下：

#include <unordered_map>
#include <cstdint>

struct Foo
{
    Foo() : x(0.0f), y(0.0f), z(0.0f) { }

    float x;
    float y;
    float z;
};

int32_t main(int32_t argc, char** argv)
{
    std::unordered_map<uint32_t, Foo> mapNoReserve;
    std::unordered_map<uint32_t, Foo> mapReserve;

    // --> Snapshot A

    mapReserve.reserve(1000);

    // --> Snapshot B

    for(uint32_t i = 0; i < 1000; ++i)
    {
        mapNoReserve.insert(std::make_pair(i, Foo()));
        mapReserve.insert(std::make_pair(i, Foo()));
    }

    // -> Snapshot C

    return 0;
}

在评论指出的地方，我拍了一张内存快照。

结果如下：

快照 A:

┌──────────────┬──────────────┬──────────────┐
|     Map      | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 64           | 8            |
| mapReserve   | 64           | 8            |
└──────────────┴──────────────┴──────────────┚

快照 B:

┌──────────────┬──────────────┬──────────────┐
|     Map      | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 64           | 8            |
| mapReserve   | 8231         | 1024         |
└──────────────┴──────────────┴──────────────┚

快照 C:

┌──────────────┬──────────────┬──────────────┐
|     Map      | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 24024        | 1024         |
| mapReserve   | 24024        | 1024         |
└──────────────┴──────────────┴──────────────┚

解读：

正如您从快照中看到的那样，一旦我们开始向它们添加元素，两个地图的大小似乎都会增加，即使是调用 reserve.

的地图也是如此

即使内存仍然分配，reserve 是否也有好处？我会说是的有两个原因：（1）它为桶预分配内存，（2）它可以避免需要 rehash，如前所述，它会完全重建地图。

为什么 std::unordered_map 有保留方法？

Why does std::unordered_map have a reserve method?

c++

memory

stdmap

visual-c++