Why is Python 3 considerably slower than Python 2?

Question

I have been trying to understand why Python 3 actually takes more time than Python 2 in certain situations. Below are a few cases I verified on Python 3.4 and Python 2.7.

Note: I have gone through some related questions, such as "Why is there no xrange function in Python3?", "loop in python3 much slower than python2" and "Same code slower in Python3 as compared to Python2", but I feel I still did not get the actual reason behind this issue.

I tried the following piece of code to show the difference:

MAX_NUM = 3*10**7

# Make the code compatible with Python 3.4, which has no xrange.
try:
    xrange
except NameError:
    xrange = range


def foo():
    i = MAX_NUM
    while i > 0:
        i -= 1

def foo_for():
    for i in xrange(MAX_NUM):
        pass

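When I ran this program with Python 3.4 and Python 2.7, I got the results below.

Note: These numbers come from a 64-bit machine with a 2.6 GHz processor, and the time was measured with time.time() over a single run.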

Output: Python 3.4
-----------------
2.6392083168029785
0.9724123477935791

Output: Python 2.7
------------------
1.5131521225
0.475143909454
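A single time.time() measurement is sensitive to noise, so the comparison above can also be reproduced with the timeit module. The following is only a sketch: the helper name best_time and the reduced MAX_NUM are assumptions made here so that it runs quickly, and they are not part of the original measurement.

```python
import timeit

# Assumption: much smaller than the question's 3*10**7 so the sketch finishes quickly.
MAX_NUM = 3 * 10**5

try:
    xrange  # exists on Python 2
except NameError:
    xrange = range  # Python 3 has only range


def foo():
    i = MAX_NUM
    while i > 0:
        i -= 1


def foo_for():
    for i in xrange(MAX_NUM):
        pass


def best_time(func, repeat=3):
    """Hypothetical helper: fastest of `repeat` single-call timings, in seconds."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


if __name__ == "__main__":
    print("foo:     %.4f s" % best_time(foo))
    print("foo_for: %.4f s" % best_time(foo_for))
```

Running the same file under both interpreters and comparing the minimum timings gives a less noisy picture than one wall-clock reading.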

I really don't think that while or xrange changed from 2.7 to 3.4. I know that range now acts as xrange did, but as the documentation says:

range() now behaves like xrange() used to behave, except it works with values of arbitrary size. The latter no longer exists.

That means the change from xrange to range is essentially a rename, except that range now works with values of arbitrary size.
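As a small illustration of the "arbitrary size" part of that quote, the sketch below builds a range that starts just past sys.maxsize. It is meant for Python 3; Python 2's xrange rejects arguments that do not fit in a C long. The variable names are chosen here for illustration only.

```python
import sys

big = sys.maxsize + 1        # one past the largest machine-sized value
r = range(big, big + 3)      # accepted by Python 3's range

values = list(r)
print(values[0] == big, len(values))
```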

I have verified the disassembled bytecode as well.

Below is the disassembled bytecode for function foo():

Python 3.4:
--------------- 

 13           0 LOAD_GLOBAL              0 (MAX_NUM)
              3 STORE_FAST               0 (i)

 14           6 SETUP_LOOP              26 (to 35)
        >>    9 LOAD_FAST                0 (i)
             12 LOAD_CONST               1 (0)
             15 COMPARE_OP               4 (>)
             18 POP_JUMP_IF_FALSE       34

 15          21 LOAD_FAST                0 (i)
             24 LOAD_CONST               2 (1)
             27 INPLACE_SUBTRACT
             28 STORE_FAST               0 (i)
             31 JUMP_ABSOLUTE            9
        >>   34 POP_BLOCK
        >>   35 LOAD_CONST               0 (None)
             38 RETURN_VALUE

Python 2.7:
-------------

 13           0 LOAD_GLOBAL              0 (MAX_NUM)
              3 STORE_FAST               0 (i)

 14           6 SETUP_LOOP              26 (to 35)
        >>    9 LOAD_FAST                0 (i)
             12 LOAD_CONST               1 (0)
             15 COMPARE_OP               4 (>)
             18 POP_JUMP_IF_FALSE       34

 15          21 LOAD_FAST                0 (i)
             24 LOAD_CONST               2 (1)
             27 INPLACE_SUBTRACT    
             28 STORE_FAST               0 (i)
             31 JUMP_ABSOLUTE            9
        >>   34 POP_BLOCK           
        >>   35 LOAD_CONST               0 (None)
             38 RETURN_VALUE

Below is the disassembled bytecode for function foo_for():

Python 3.4:
---------------

 19           0 SETUP_LOOP              20 (to 23)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_GLOBAL              1 (MAX_NUM)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 GET_ITER
        >>   13 FOR_ITER                 6 (to 22)
             16 STORE_FAST               0 (i)

 20          19 JUMP_ABSOLUTE           13
        >>   22 POP_BLOCK
        >>   23 LOAD_CONST               0 (None)
             26 RETURN_VALUE


Python 2.7:
-------------

 19           0 SETUP_LOOP              20 (to 23)
              3 LOAD_GLOBAL              0 (xrange)
              6 LOAD_GLOBAL              1 (MAX_NUM)
              9 CALL_FUNCTION            1
             12 GET_ITER            
        >>   13 FOR_ITER                 6 (to 22)
             16 STORE_FAST               0 (i)

 20          19 JUMP_ABSOLUTE           13
        >>   22 POP_BLOCK           
        >>   23 LOAD_CONST               0 (None)
             26 RETURN_VALUE

Comparing the two, both versions produce the same disassembled bytecode.
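Listings like the ones above come from the standard dis module. The sketch below shows one way to collect the opcode names programmatically so that two interpreters can be compared. The helper opnames is a name invented here, dis.get_instructions requires Python 3.4 or later, and the exact opcodes differ between CPython releases.

```python
import dis


def foo_for():
    for i in range(1000):
        pass


def opnames(func):
    """Hypothetical helper: list the opcode names of a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    dis.dis(foo_for)          # human-readable listing, as shown in the question
    print(opnames(foo_for))   # flat list that is easy to diff between versions
```

Identical bytecode only shows that the compiler output is the same; it says nothing about how each opcode is executed for a given object type.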

Now I am wondering which change from 2.7 to 3.4 is really causing this large difference in execution time for the given piece of code.

Answer 1

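The difference is in the implementation of the int type. Python 3.x uses the arbitrary-size integer type (long in 2.x) exclusively, while Python 2.x uses a simpler int type for values up to sys.maxint, backed by a plain C long under the hood.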
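The single integer type can be observed directly on Python 3. The sketch below assumes CPython, where sys.getsizeof reports a larger object for a value beyond sys.maxsize. On Python 2 the two values would have different types, int and long.

```python
import sys

small = 1
large = sys.maxsize + 1   # would be a `long` on Python 2

# On Python 3 both values share the same arbitrary-size type.
print(type(small) is type(large))

# CPython detail: the larger value needs more storage.
print(sys.getsizeof(small), sys.getsizeof(large))
```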

Once you restrict the loop to values that are long integers on Python 2 (that is, above sys.maxsize), Python 3.x is faster:

>>> import sys
>>> from timeit import timeit
>>> MAX_NUM = 3*10**3
>>> def bar():
...     i = MAX_NUM + sys.maxsize
...     while i > sys.maxsize:
...         i -= 1
...

Python 2:

>>> timeit(bar, number=10000)
5.704327821731567

Python 3:

>>> timeit(bar, number=10000)
3.7299320790334605

I used sys.maxsize because sys.maxint was dropped from Python 3, but the integer value is basically the same.

The speed advantage of Python 2 is thus limited to the first (2 ** 63) - 1 integers on 64-bit systems and the first (2 ** 31) - 1 integers on 32-bit systems.
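Those limits can be checked against the running interpreter. This is a sketch with variable names chosen here. It relies on sys.maxsize, which matches Python 2's sys.maxint on typical Unix-like builds but not on 64-bit Windows, where a C long is only 32 bits wide.

```python
import sys

# 64 on a 64-bit build, 32 on a 32-bit build.
pointer_bits = sys.maxsize.bit_length() + 1

# Largest value that fits in a signed integer of that width.
limit = 2 ** (pointer_bits - 1) - 1

print(pointer_bits, limit == sys.maxsize)
```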

Since you cannot use the long type with xrange() on Python 2, I did not include a comparison for that function.
