为什么 Python 中的浮点除法用较小的数字更快?

Why is floating-point division in Python faster with smaller numbers?

在回答的过程中,遇到了无法解释的问题。

给定以下 Python 3.5 代码:

import time

def di(n):
    for i in range(10000000): n / 101

i = 10
while i < 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000:
    start = time.clock()
    di(i)
    end = time.clock()
    print("On " + str(i) + " " + str(end-start))
    i *= 10000

输出为:

On 10 0.546889
On 100000 0.545004
On 1000000000 0.5454929999999998
On 10000000000000 0.5519709999999998
On 100000000000000000 1.330797
On 1000000000000000000000 1.31053
On 10000000000000000000000000 1.3393129999999998
On 100000000000000000000000000000 1.3524339999999997
On 1000000000000000000000000000000000 1.3817269999999997
On 10000000000000000000000000000000000000 1.3412670000000002
On 100000000000000000000000000000000000000000 1.3358929999999987
On 1000000000000000000000000000000000000000000000 1.3773859999999996
On 10000000000000000000000000000000000000000000000000 1.3326890000000002
On 100000000000000000000000000000000000000000000000000000 1.3704769999999993
On 1000000000000000000000000000000000000000000000000000000000 1.3235019999999995
On 10000000000000000000000000000000000000000000000000000000000000 1.357647
On 100000000000000000000000000000000000000000000000000000000000000000 1.3341190000000012
On 1000000000000000000000000000000000000000000000000000000000000000000000 1.326544000000002
On 10000000000000000000000000000000000000000000000000000000000000000000000000 1.3671139999999973
On 100000000000000000000000000000000000000000000000000000000000000000000000000000 1.3630120000000012
On 1000000000000000000000000000000000000000000000000000000000000000000000000000000000 1.3600200000000022
On 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000 1.3189189999999975
On 100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 1.3503469999999993

大家可以看到,大致有两种,一种是小数,一种是大数。

Python 2.7 使用以下函数来保留语义,结果相同:

def di(n):
    for i in xrange(10000000): n / 101.0

在同一台机器上,我得到:

On 10 0.617427
On 100000 0.61805
On 1000000000 0.6366
On 10000000000000 0.620919
On 100000000000000000 0.616695
On 1000000000000000000000 0.927353
On 10000000000000000000000000 1.007156
On 100000000000000000000000000000 0.98597
On 1000000000000000000000000000000000 0.99258
On 10000000000000000000000000000000000000 0.966753
On 100000000000000000000000000000000000000000 0.992684
On 1000000000000000000000000000000000000000000000 0.991711
On 10000000000000000000000000000000000000000000000000 0.994703
On 100000000000000000000000000000000000000000000000000000 0.978877
On 1000000000000000000000000000000000000000000000000000000000 0.982035
On 10000000000000000000000000000000000000000000000000000000000000 0.973266
On 100000000000000000000000000000000000000000000000000000000000000000 0.977911
On 1000000000000000000000000000000000000000000000000000000000000000000000 0.996857
On 10000000000000000000000000000000000000000000000000000000000000000000000000 0.972555
On 100000000000000000000000000000000000000000000000000000000000000000000000000000 0.985676
On 1000000000000000000000000000000000000000000000000000000000000000000000000000000000 0.987412
On 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0.997207
On 100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0.970129

为什么小数与大数的浮点除法之间存在这种一致的差异?这是否与 Python 在内部对较小的数字使用浮点数而对较大的数字使用双精度数有关?

它与 Python 将精确整数存储为 Bignums 更多有关。

在Python 2.7中,计算整数a / float fb ,首先将整数转换为浮点数。如果整数存储为 Bignum [注 1],则需要更长的时间。因此,成本差异不在于部门,而在于部门。它是整数(可能是 Bignum)到双精度数的转换。

Python 3 对 integer a / float 进行相同的计算fb,但 整数 a / 整数 b,它尝试计算最接近的可表示结果,这可能与朴素的 float(a) / float(b) 略有不同。 (这类似于经典的双舍入问题。)

如果 float(a)float(b) 都是精确的(即 ab 都不大于 53 位),那么天真的解决方案有效,结果只需要除以两个双精度浮点数。

否则,进行多精度除法生成正确的53位尾数(指数单独计算),并将结果精确转换为浮点数。这种划分有两种可能性:如果 b 足够小以适合单个 Bignum 单元(适用于 OP 中的基准),则为快速通道;当 [=14 时,则为较慢的一般 Bignum 划分=] 更大。

在上述情况的none中,观察到的速度差异与硬件执行浮点除法的速度有关。对于原来的Python 3.5测试,区别在于是否进行浮点除法或Bignum除法;对于 Python 2.7 的情况,差异与将 Bignum 转换为 double 的必要性有关。

感谢@MarkDickinson 的澄清,以及指向实现该算法的 the source code (with a long and useful comment) 的指针。


备注

  1. 在 Python 3 中,整数 总是 存储为 Bignums。 Python 2 有不同的类型 int(64 位整数)和 long(Bignums)。在实践中,由于 Python 3 在 Bignum 只有一个“腿”时经常使用优化算法,因此“小”和“大”整数之间的差异仍然很明显。

正如@rici 所说,这是更大的整数格式。我将最初的 10 更改为 10.0 ...这是结果,时间没有显着变化。

On 10.0 1.12
On 100000.0 0.79
On 1000000000.0 0.79
On 1e+13 0.77
On 1e+17 0.78
On 1e+21 0.79
On 1e+25 0.77
On 1e+29 0.8
On 1e+33 0.77
On 1e+37 0.8
On 1e+41 0.78
On 1e+45 0.78
On 1e+49 0.78
On 1e+53 0.79
On 1e+57 0.77
On 1e+61 0.8
On 1e+65 0.77
On 1e+69 0.79
On 1e+73 0.77
On 1e+77 0.78
On 1e+81 0.78
On 1e+85 0.78
On 1e+89 0.77