bucket_count 的 libc++ unordered_set.reserve

bucket_count of libc++'s unordered_set.reserve

当我在 libc++ 的空 unordered_set 上调用 reserve(n) 时,如果 n 不是二的幂,则 libc++ 会找到下一个质数 bucket_count,否则它们只需使用 n。这也使得 reserve(15)reserve(16).

具有更大的 bucket_count

libc++ source:

    if (__n == 1)
        __n = 2;
    else if (__n & (__n - 1))
        __n = __next_prime(__n);

我想知道这有什么有用的案例吗?

我知道一些 hash-table 实现可以使用 2 的幂作为大小来避免模计算。但这似乎只有在编译时知道只能使用二次幂的情况下才有意义。

测试演示:https://godbolt.org/z/W5joovP4q

#include <cstdio>
#include <unordered_set>

bool is_prime(int x) {
    if (x < 2) return false;
    if (x % 2 == 0) return x == 2;
    for (int i = 3; i * i <= x; i += 2)
        if (x % i == 0) return false;
    return true;
}

int next_prime(int x) {
    while (!is_prime(x))++x;
    return x;
}

__attribute__((noinline))
void test(int i) {
    std::unordered_set<char> s;
    s.reserve(i);
    int a = s.bucket_count();
    int b = next_prime(i);
    if (a != b) {
        printf("%6d %6d %6d\n", i, a, b);
    }
}

int main() {
    for (int i = 0; i < 3000; ++i)
        test(i);
}

程序标准输出:

     0      0      2
     4      4      5
     8      8     11
    16     16     17
    32     32     37
    64     64     67
   128    128    131
   256    256    257
   512    512    521
  1024   1024   1031
  2048   2048   2053

original commit message 阐明了这一点。简而言之,libc++ 最初只使用质数。此提交引入了对 power-of-2 数字的优化,以防用户明确请求它们。

另请注意,该提交中的新函数 __constrain_hash() 检查它是否为 power-of-2,然后不使用模运算。根据提交消息,不使用模运算可以抵消输入是否为 power-of-2 数字的额外检查的成本。因此,即使您不知道编译时的信息,您也可以获得性能提升。

引用commit message

This commit establishes a new bucket_count policy in the unordered containers: The policy now allows a power-of-2 number of buckets to be requested (and that request honored) by the client. And if the number of buckets is set to a power of 2, then the constraint of the hash to the number of buckets uses & instead of %. If the client does not specify a number of buckets, then the policy remains unchanged: a prime number of buckets is selected. The growth policy is that the number of buckets is roughly doubled when needed. While growing, either the prime, or the power-of-2 strategy will be preserved. There is a small run time cost for putting in this switch. For very cheap hash functions, e.g. identity for int, the cost can be as high as 18%. However with more typical use cases, e.g. strings, the cost is in the noise level. I've measured cases with very cheap hash functions (int) that using a power-of-2 number of buckets can make look up about twice as fast. However I've also noted that a power-of-2 number of buckets is more susceptible to accidental catastrophic collisions. Though I've also noted that accidental catastrophic collisions are also possible when using a prime number of buckets (but seems far less likely). In short, this patch adds an extra tuning knob for those clients trying to get the last bit of performance squeezed out of their hash containers. Casual users of the hash containers will not notice the introduction of this tuning knob. Those clients who swear by power-of-2 hash containers can now opt-in to that strategy. Clients who prefer a prime number of buckets can continue as they have.

引入此更改的提交在这里:https://github.com/llvm/llvm-project/commit/4cb38a82a26cb3f4ac86834659e0af5f873c1d5b

它有一个很长的提交消息,但它是关于故意遵守 power-of-two 存储桶计数。

如果您查看更改,如果 bucket_count 是 2 的幂,则实现使用 & 而不是 %。它不像在编译时那样高效,但提交消息声称它不像寻找复杂的哈希函数那样昂贵。

size_t
__constrain_hash(size_t __h, size_t __bc)
{
    return !(__bc & (__bc - 1)) ? __h & (__bc - 1) : __h % __bc;
}