Why is my std::unordered_map access time not constant

Question

I wrote some code to test the performance of my unordered map, using a 2-component vector as the key.

std::unordered_map<Vector2i, int> m;

// Fill the map with 1000*1000 keys.
for(int i = 0; i < 1000; ++i)
    for(int j = 0; j < 1000; ++j)
        m[Vector2i(i,j)] = i*j+27*j;

clock.restart();

// Time a single lookup.
auto found = m.find(Vector2i(0,5));

std::cout << clock.getElapsedTime().asMicroseconds() << std::endl;

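Output of the code above: 56 (microseconds). When I replace 1000 with 100 in the for loops, the output is 2 (microseconds). Isn't the time supposed to be constant?

The hash function for my Vector2i: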

namespace std
{
    template<>
    struct hash<Vector2i>
    {
        std::size_t operator()(const Vector2i& k) const
        {
            // XOR the hash of x with the hash of y shifted left by one bit.
            return (std::hash<int>()(k.x)) ^ (std::hash<int>()(k.y) << 1);
        }
    };
}

Edit: I added this code to count the collisions after the for loops:

// Counts the buckets that hold more than one element.
std::size_t collisions = 0;
for (size_t bucket = 0; bucket != m.bucket_count(); ++bucket)
    if (m.bucket_size(bucket) > 1)
        ++collisions;

With 100*100 elements: collisions = 256

With 1000*1000 elements: collisions = 2048

Answer 1

A hash table guarantees constant time only on average (amortized for insertions); in the worst case a lookup is linear in the number of elements. If the hash table is well balanced (i.e., the hash function is good), the elements are spread evenly across the buckets. If the hash function is poor, you may get lots of collisions, and to access an element you then have to traverse the bucket's chain, usually a linked list holding the elements that collided. So first make sure the load factor and the hash function are adequate in your case. Finally, make sure you compile in release mode with optimizations enabled (e.g. -O3 for g++/clang++).
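To check whether the hash function is the problem here, it can help to count how many distinct hash values the keys actually produce. The sketch below is only an illustration: it models std::hash<int> as the identity function, which is common in standard library implementations but not guaranteed by the standard, and the names xor_shift_hash, count_distinct_hashes and longest_bucket are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>

// The combining step from the question, with hash<int> modelled as the
// identity function (an assumption; the standard does not require it).
inline std::size_t xor_shift_hash(int x, int y) {
    return static_cast<std::size_t>(x) ^ (static_cast<std::size_t>(y) << 1);
}

// Counts how many distinct hash values the n*n grid of keys produces.
inline std::size_t count_distinct_hashes(int n) {
    std::unordered_set<std::size_t> seen;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            seen.insert(xor_shift_hash(i, j));
    return seen.size();
}

// Returns the size of the fullest bucket of any unordered container.
// A lookup that lands in that bucket may have to walk all of its elements.
template <class Container>
std::size_t longest_bucket(const Container& c) {
    std::size_t longest = 0;
    for (std::size_t b = 0; b != c.bucket_count(); ++b)
        longest = std::max(longest, c.bucket_size(b));
    return longest;
}
```

Under that assumption, count_distinct_hashes(100) returns 256 and count_distinct_hashes(1000) returns 2048, which is consistent with the bucket counts reported in the question. With coordinates below 1000, x fits in 10 bits and y << 1 fits in 11 bits, so the XOR can never exceed 2047. One million keys then share 2048 hash values, roughly 488 keys per value, against roughly 39 per value in the 100*100 case, so a single find has a much longer chain to walk. Calling longest_bucket(m) on the real map, together with m.load_factor(), would show whether this is what happens on a given implementation.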

This question may also be useful: How to create a good hash_combine with 64 bit output (inspired by boost::hash_combine).

