CPython: Why does a 3-line script take far more than 3 cycles of the interpreter loop to execute?
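I just watched Philip Guo's CPython Internals lecture on YouTube, and one thing puzzles me.
At 25:55, he modifies CPython's C source code and inserts printf("hello\n"); at the start of the infinite loop that executes all bytecode instructions. You can do the same by: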
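- Download the Python 2.7 C source code.
- Open the file Python/ceval.c
- Find the start of the infinite evaluation loop, for (;;) {
- Add the line printf("hello\n"); as the first line of the infinite loop.
- Run configure and make to build the Python binary.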
He then wrote a 3-line test.py:
X = 1
Y = 2
print X + Y
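What puzzles me is this: when he runs test.py with the modified interpreter, why are there so many "hello" lines before we see "3"?
Those 3 lines should compile into only a few bytecode instructions (load the value 1, load the value 2, and an instruction that calls print), so I would expect to see only a few "hello" lines while the bytecode compiled from the script is executed.
So does the compiler actually generate many internal bytecode instructions before compiling the external Python script?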
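There are two reasons you see so many hello lines printed:
- Python does not have a dedicated bytecode for every possible Python statement. Instead, a statement is compiled into a combination of bytecodes.
- The Python interpreter imports a series of Python modules just to get up and running. You can run a regular Python interpreter with the -v switch to see what is imported each time. Each module consists of multiple statements, so quite a lot of bytecode is executed before your small script starts running.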
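To get a rough feel for how much bytecode a single startup module carries, the sketch below counts the instructions compiled for the os module. It assumes Python 3 (this thread uses Python 2.7), and the helper name count_instructions is made up for illustration. It counts the instructions that exist in the module, including function bodies that never run at import time, so it is not a count of instructions actually executed.

```python
import dis
import importlib.util
import types


def count_instructions(code):
    """Count bytecode instructions in a code object and all nested code objects."""
    total = sum(1 for _ in dis.get_instructions(code))
    for const in code.co_consts:
        # Functions, classes and comprehensions are stored as nested code objects.
        if isinstance(const, types.CodeType):
            total += count_instructions(const)
    return total


# Ask the import system for the compiled code of a module loaded at startup.
spec = importlib.util.find_spec("os")
os_code = spec.loader.get_code("os")
os_total = count_instructions(os_code)
print("os holds", os_total, "bytecode instructions")
```

Only the module-level instructions, not the nested function bodies, run when the module is imported, but even that subset is far larger than the handful produced by a 3-line script.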
If I put those 3 lines in test.py and run it with my unmodified Python 2.7 binary, passing the -v switch, I see:
$ python2.7 -v test.py
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# /..../lib/python2.7/site.pyc matches /..../lib/python2.7/site.py
import site # precompiled from /..../lib/python2.7/site.pyc
# /..../lib/python2.7/os.pyc matches /..../lib/python2.7/os.py
import os # precompiled from /..../lib/python2.7/os.pyc
import errno # builtin
import posix # builtin
# /..../lib/python2.7/posixpath.pyc matches /..../lib/python2.7/posixpath.py
import posixpath # precompiled from /..../lib/python2.7/posixpath.pyc
# /..../lib/python2.7/stat.pyc matches /..../lib/python2.7/stat.py
import stat # precompiled from /..../lib/python2.7/stat.pyc
# /..../lib/python2.7/genericpath.pyc matches /..../lib/python2.7/genericpath.py
import genericpath # precompiled from /..../lib/python2.7/genericpath.pyc
# /..../lib/python2.7/warnings.pyc matches /..../lib/python2.7/warnings.py
import warnings # precompiled from /..../lib/python2.7/warnings.pyc
# /..../lib/python2.7/linecache.pyc matches /..../lib/python2.7/linecache.py
import linecache # precompiled from /..../lib/python2.7/linecache.pyc
# /..../lib/python2.7/types.pyc matches /..../lib/python2.7/types.py
import types # precompiled from /..../lib/python2.7/types.pyc
# /..../lib/python2.7/UserDict.pyc matches /..../lib/python2.7/UserDict.py
import UserDict # precompiled from /..../lib/python2.7/UserDict.pyc
# /..../lib/python2.7/_abcoll.pyc matches /..../lib/python2.7/_abcoll.py
import _abcoll # precompiled from /..../lib/python2.7/_abcoll.pyc
# /..../lib/python2.7/abc.pyc matches /..../lib/python2.7/abc.py
import abc # precompiled from /..../lib/python2.7/abc.pyc
# /..../lib/python2.7/_weakrefset.pyc matches /..../lib/python2.7/_weakrefset.py
import _weakrefset # precompiled from /..../lib/python2.7/_weakrefset.pyc
import _weakref # builtin
# /..../lib/python2.7/copy_reg.pyc matches /..../lib/python2.7/copy_reg.py
import copy_reg # precompiled from /..../lib/python2.7/copy_reg.pyc
import encodings # directory /..../lib/python2.7/encodings
# /..../lib/python2.7/encodings/__init__.pyc matches /..../lib/python2.7/encodings/__init__.py
import encodings # precompiled from /..../lib/python2.7/encodings/__init__.pyc
# /..../lib/python2.7/codecs.pyc matches /..../lib/python2.7/codecs.py
import codecs # precompiled from /..../lib/python2.7/codecs.pyc
import _codecs # builtin
# /..../lib/python2.7/encodings/aliases.pyc matches /..../lib/python2.7/encodings/aliases.py
import encodings.aliases # precompiled from /..../lib/python2.7/encodings/aliases.pyc
# /..../lib/python2.7/encodings/utf_8.pyc matches /..../lib/python2.7/encodings/utf_8.py
import encodings.utf_8 # precompiled from /..../lib/python2.7/encodings/utf_8.pyc
Python 2.7.15 (default, May 7 2018, 17:08:03)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
3
# -- clean-up output omitted --
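Each of the import ... lines there refers either to a built-in module (part of the Python binary, implemented in C) or to a .pyc bytecode cache file. 17 such .pyc files are imported before the script's code runs.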
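The same check can be automated. The sketch below assumes Python 3, which writes its -v output to stderr and imports a different set of modules than the 2.7 listing above. The function name count_startup_imports is hypothetical. It starts a child interpreter that runs an empty program and counts the import lines it reports.

```python
import subprocess
import sys


def count_startup_imports():
    """Run `python -v -c pass` and count the 'import ...' lines in its verbose output."""
    result = subprocess.run(
        [sys.executable, "-v", "-c", "pass"],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate odd bytes in file system paths
    )
    # Python 3 prints the verbose import trace on stderr.
    return sum(1 for line in result.stderr.splitlines() if line.startswith("import "))


startup_imports = count_startup_imports()
print(startup_imports, "modules imported before any user code runs")
```

The exact number depends on the Python version, the platform, and site configuration, so treat it as an order-of-magnitude indicator.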
The 3 lines in the main script translate into another 9 bytecode instructions:
>>> import dis
>>> dis.dis(compile(r'''\
... X = 1
... Y = 2
... print X + Y
... ''', '', 'exec'))
2 0 LOAD_CONST 0 (1)
3 STORE_NAME 0 (X)
3 6 LOAD_CONST 1 (2)
9 STORE_NAME 1 (Y)
4 12 LOAD_NAME 0 (X)
15 LOAD_NAME 1 (Y)
18 BINARY_ADD
19 PRINT_ITEM
20 PRINT_NEWLINE
21 LOAD_CONST 2 (None)
24 RETURN_VALUE
(I have ignored the last 2 bytecodes, which encode an extra return None that does not really apply to modules.)
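For readers on Python 3, a minimal sketch of the same disassembly follows. It assumes the print statement is rewritten as a print() call, and the file name passed to compile is only a label. Opcode names and the instruction count vary between Python versions, so no expected output is shown.

```python
import dis

# Python 3 version of the 3-line script from the question.
source = "X = 1\nY = 2\nprint(X + Y)\n"
code = compile(source, "<test.py>", "exec")

# get_instructions yields one Instruction per bytecode in the module body.
instructions = list(dis.get_instructions(code))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
print(len(instructions), "instructions for 3 lines of source")
```

Each pass through the evaluation loop handles one of these instructions, which is why even this script alone produces more than 3 "hello" lines.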