Why does Python's hash of infinity have the digits of π?

Question

The hash of infinity in Python has digits matching pi:

>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

Is this a coincidence, or is it intentional?

Answer 1

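_PyHASH_INF is defined as a constant equal to 314159.

I can't find any discussion of this, or any comment giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don't use the same meaningful value for other hashes, it doesn't matter.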

Answer 2

Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.

The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159

(Same results with PyPy too.)
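As a quick check of the relationship described above, the following sketch compares the hash of each infinity against sys.hash_info.inf. It assumes Python 3; the variable names are only illustrative.

```python
import math
import sys

pos_inf = float("inf")
neg_inf = float("-inf")

# On Python 3, hash(+inf) is the constant published in sys.hash_info,
# and hash(-inf) is its negation.
pos_hash = hash(pos_inf)
neg_hash = hash(neg_inf)

print(pos_hash == sys.hash_info.inf)   # hash(+inf) matches the published constant
print(neg_hash == -sys.hash_info.inf)  # hash(-inf) is the negated constant
print(hash(math.inf) == pos_hash)      # math.inf is the same float value
```

Comparing against sys.hash_info.inf rather than the literal 314159 keeps the check valid on any implementation that publishes a different constant.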

In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

where _PyHASH_INF is defined as 314159:

#define _PyHASH_INF 314159
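To see where the infinity special case sits relative to ordinary float hashing, here is a rough pure-Python model. It is not CPython's C code: it follows the modular-arithmetic description of numeric hashing given in the Python 3 documentation. The function name sketch_float_hash is hypothetical, and NaN is deliberately left out because its hash differs between Python versions.

```python
import math
import sys

P = sys.hash_info.modulus   # a Mersenne prime, e.g. 2**61 - 1 on 64-bit builds
INF = sys.hash_info.inf     # 314159 in CPython


def sketch_float_hash(x):
    """Hypothetical pure-Python model of float hashing (NaN not handled)."""
    if math.isnan(x):
        raise ValueError("NaN hashing varies across Python versions")
    if math.isinf(x):
        # The special case discussed above: a fixed constant, sign-adjusted.
        return INF if x > 0 else -INF
    # A finite float is exactly m / n, with n a power of two.
    m, n = x.as_integer_ratio()
    # pow(n, P - 2, P) is the inverse of n modulo the prime P (Fermat).
    h = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        h = -h
    # -1 is reserved as an error return in the C API, so it maps to -2.
    return -2 if h == -1 else h


samples = [0.0, 1.0, -1.0, 0.5, -3.25, 1e300, float("inf"), float("-inf")]
results = [(x, sketch_float_hash(x), hash(x)) for x in samples]
for x, mine, builtin in results:
    print(x, mine, builtin)
```

Infinity cannot be written as a ratio of integers, so the modular formula has nothing to reduce, which is why some fixed constant has to be returned for it.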

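As far as history goes, the first mention of 314159 in this context in the Python code (you can find it with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.

The commit message says: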

Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470. This was a misleading bug -- the true "bug" was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent's earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn't an error (didn't waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.

In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.

Related later commits:
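The two constants in that commit can be read straight off the leading digits of π and e. A minimal check, using the same scaling trick as the question (the variable names are only illustrative):

```python
import math

# Scale by 1e5 and truncate to keep the first six significant digits.
pi_digits = int(math.pi * 1e5)
e_digits = int(math.e * 1e5)

print(pi_digits, e_digits)  # 314159 271828
```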

By Mark Dickinson in Apr 2010 (also), making the Decimal type behave similarly
By Mark Dickinson in Apr 2010 (also), moving this check to the top and adding test cases
By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation, but retaining this special case and giving the constant the name _PyHASH_INF (also removing the 271828, which is why in Python 3 hash(float('-inf')) returns -314159 rather than -271828 as it does in Python 2)
By Raymond Hettinger in Jan 2011, adding an explicit example in the "What's new" for Python 3.2 of sys.hash_info showing the above value. (See here.)
By Stefan Krah in Mar 2012, modifying the Decimal module but keeping this hash.
By Christian Heimes in Nov 2013, moving the definition of _PyHASH_INF from Include/pyport.h to Include/pyhash.h, where it now lives.

Answer 3

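Indeed,

sys.hash_info.inf

returns 314159. The value is not generated; it is built into the source code. In fact,

hash(float('-inf'))

returns -271828, or approximately -e, in Python 2 (it's -314159 now).

The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.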
