Why does Python's hash of infinity have the digits of π?

The hash of infinity in Python has digits matching pi:
>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

Is that just a coincidence, or is it intentional?

_PyHASH_INF is defined as a constant equal to 314159.

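I can't find any discussion of this, or any comment giving a reason. I think it was chosen more or less arbitrarily. As long as they don't use the same meaningful value for other hashes, it shouldn't matter.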

Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.


The value of hash(float('inf')) is one of the system-dependent parameters of the hash function for built-in numeric types, and is also available as sys.hash_info.inf in Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159
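
As a quick cross-check of the session above, here is a minimal sketch that compares the hashes of both infinities against sys.hash_info.inf rather than against a hard-coded literal. The variable names are illustrative only, and it assumes a Python 3 interpreter that follows the documented numeric hashing rules.

```python
import sys

inf = float("inf")

# Compare against the documented parameter, not a hard-coded 314159,
# so the check stays meaningful on any conforming implementation.
pos_matches = hash(inf) == sys.hash_info.inf
neg_matches = hash(-inf) == -sys.hash_info.inf

print(pos_matches, neg_matches)
```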

(Same results with PyPy too.)


In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

where _PyHASH_INF is defined as 314159:

#define _PyHASH_INF 314159
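
The infinity branch quoted above can be mirrored in Python as a rough sketch. The function name infinity_hash is hypothetical (it is not part of CPython), it covers only the infinity case, and it reads the constant from sys.hash_info.inf instead of repeating the C macro.

```python
import math
import sys


def infinity_hash(v: float) -> int:
    """Hypothetical helper mirroring only the infinity branch of the C code."""
    if not math.isinf(v):
        raise ValueError("this sketch handles infinities only")
    # Same shape as: return v > 0 ? _PyHASH_INF : -_PyHASH_INF;
    return sys.hash_info.inf if v > 0 else -sys.hash_info.inf


# On Python 3 the sketch should agree with the built-in hash.
print(infinity_hash(float("inf")) == hash(float("inf")))
print(infinity_hash(float("-inf")) == hash(float("-inf")))
```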

In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.

The commit message says:

Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470. This was a misleading bug -- the true "bug" was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent's earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn't an error (didn't waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.

In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

So, as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
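
To make the origin of the two constants concrete, here is a small sketch that extracts the leading six digits of π and e. The helper leading_digits is a made-up name for illustration and assumes its argument lies in [1, 10).

```python
import math


def leading_digits(x: float, count: int = 6) -> int:
    """Return the first `count` decimal digits of x, assuming 1 <= x < 10."""
    return int(x * 10 ** (count - 1))


pi_digits = leading_digits(math.pi)
e_digits = leading_digits(math.e)

print(pi_digits)  # 314159
print(e_digits)   # 271828
```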

Related later commits:

Indeed, sys.hash_info.inf returns 314159. The value is not generated; it is built into the source code. In fact, hash(float('-inf')) returns -271828, or approximately -e, in Python 2 (it's -314159 now).

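The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.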