为散列函数定义一个 c++20 概念

Question

CppCoreGuidelines 声明应为所有模板参数指定概念。（参见：T.10: Specify concepts for all template arguments）作为定义概念的实践，我正在尝试构建一个哈希表，其中包含为哈希函数和键模板参数定义的概念。

我希望我的哈希表使用两个模板参数，HashFunc 和 Key。 HashFunc 应该是函数对象，Key 应该是函数对象的参数 HashFunc.

也就是说，HashFunc(Key) 应该 return 一个可转换为 size_t 的类型。

在cppreference上，有一个例子定义了概念Hashable。我复制了以下示例：

template<typename T>
concept Hashable = requires(T a) {
    { std::hash<T>{}(a) } -> std::convertible_to<std::size_t>;
};

这个 Hashable 概念对许多用途都有意义。在这些情况下，T 类型对象的哈希函数是特化 std::hash<T>。但是，出于我的目的，我不想假设哈希将是 std::hash<Key>。我希望用户能够提供不同的哈希函数。

由于 HashFunc 和 Key 联系如此紧密，我认为我不能为 HashFunc 和 Key 定义单独的概念。 对吗？ 所以我想定义一个概念 HashConcept 同时处理 HashFunc 和 Key。

所以我定义了一个概念 Hash来处理这两个问题。我尽力定义这个概念，使其符合 Hash here 的命名要求。然后目标是满足4个条件。在此列表下方，我谈到了尝试强制执行这些条件。

return 类型可转换为 std::size_t。
散列函数在程序运行期间显示等式保留（h(k1) == h(k1)。请参阅 C++ Extensions for Ranges 第 19.1.1 节）
如果u是一个左值Key，那么h(u)不会修改u。
“a!=b 的 h(a)==h(b) 的概率应该接近 1.0/std::numeric_limits<std::size_t>::max()。”

这个列表看起来完整吗？

我不相信概念可以强制执行 (4)，并且 (4) 只需要在 comments/documentation 中指明。我相信概念可能能够强制执行 (2) 和 (3)，但我不确定如何执行。 C++ Extensions for Ranges section 19.5定义了概念Callable和RegularCallable，然后说“注意：Callable和RegularCallable的区别是纯语义的。 — 尾注”，表明 (2) 不能强制执行。剩下 (1) 和 (3)。

我定义了一个强制执行 (1) 的概念。

template<typename HashFunc, typename Key>
concept Hash = requires(HashFunc h, Key k) {
    { std::invoke(h, k) } -> std::convertible_to<std::size_t>;
};

这个概念对吗？（例如，我应该使用 requires 或 return 编辑 bool 吗？）我的概念是否可以扩展到解决散列函数的其他要求，例如 (2)-(4)？

下面是一些使用该概念的示例代码。结果是打印 3 到 std::cout.

#include <functional>
#include <concepts>
#include <iostream>

template<typename HashFunc, typename Key>
concept HashConcept = requires(HashFunc h, Key k) {
    { std::invoke(h, k) } -> std::convertible_to<std::size_t>;
};


class HashFunc {
public:
    std::size_t operator()(int i) {
        return static_cast<size_t>(i);
    }
};

template<typename Hash, typename Key>
    requires HashConcept<Hash, Key>
size_t HashConceptUser(Hash h, Key k) {
    return h(k);
}

int main() {
    std::cout << HashConceptUser< HashFunc, int >(HashFunc{}, 3); 

}

Answer 1

Does this list appear complete?

该列表缺少可以说是哈希函数最重要的一个标准：if a == b then h(a) == h(b).

列表中的第 4 个标准是您想要的 good 哈希函数，它本身有些不完整 - 您不只是希望碰撞的可能性很小，你还想要随机分散。哈希函数 h(i) = i 满足第 4 个条件，但不是好的哈希函数。另一方面，h(i) = 0 是一个糟糕的哈希函数，但应该被认为是有效的。

也就是说，C++ 语言概念不能强制执行 任何这些事情 - 你不能强制哈希函数是相等的，你不能强制它不修改它的输入，并且您不能强制执行有关其结果分布的任何事情。这些就是我们所说的语义约束而不是句法约束（C++标准谈到满足一个概念，如果满足句法约束并且建模一个概念，如果满足句法和语义的话）。语义约束是记录在案的要求（在注释中，或只是文档中）而不是编码的要求。

你能做的最好的语法只是验证散列函数是可调用的并给你一个整数：

template <typename F, typename T>
concept HashFor = std::regular_invocable<F, T>
               && std::convertible_to<std::invoke_result_t<F, T>, size_t>;

我在这里使用 regular_invocable 是因为 that concept 添加了您想要的语义约束：函数调用保持相等性并且不修改函数对象或其参数。你也可以这样写：

template <typename F, typename T>
concept HashFor = std::regular_invocable<F, T>
    && requires(F f, T t) {
        { std::invoke(f, t) } -> std::convertible_to<size_t>;
    };

但我会保留 regular_invocable 部分。

为散列函数定义一个 c++20 概念

Defining a c++20 concept for hash functions

c++

hashtable

c++-concepts

c++20