The relationship between function objects and code objects in CPython

Include/funcobject.h in the CPython source begins with the following comment:

Function objects and code objects should not be confused with each other:

Function objects are created by the execution of the 'def' statement. They reference a code object in their __code__ attribute, which is a purely syntactic object, i.e. nothing more than a compiled version of some source code lines. There is one code object per source code "fragment", but each code object can be referenced by zero or many function objects depending only on how many times the 'def' statement in the source was executed so far.

I don't quite understand it.


Here is my partial understanding. Perhaps someone can complete it.

  1. Compilation phase.

    We have the source file Test.py:

    def a_func():
        pass
    

    The interpreter parses it and creates two code objects: one for Test.py and one for a_func. The Test.py code object has the following co_code field (disassembled):

      3           0 LOAD_CONST               0 (<code object a_func at 0x7f8975622b70, file "test.py", line 3>)
                  2 LOAD_CONST               1 ('a_func')
                  4 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (a_func)
                  8 LOAD_CONST               2 (None)
                 10 RETURN_VALUE
    

    No function object is created at this stage.

  2. Execution phase.

    • Function objects are created by the execution of the 'def' statement.

    When the virtual machine reaches the MAKE_FUNCTION instruction, it creates a function object:

    typedef struct {
        PyObject_HEAD
        PyObject *func_code;        /* A code object, the __code__ attribute */
        PyObject *func_globals;     /* A dictionary (other mappings won't do) */
        PyObject *func_defaults;    /* NULL or a tuple */
        PyObject *func_kwdefaults;  /* NULL or a dict */
        PyObject *func_closure;     /* NULL or a tuple of cell objects */
        PyObject *func_doc;         /* The __doc__ attribute, can be anything */
        PyObject *func_name;        /* The __name__ attribute, a string object */
        PyObject *func_dict;        /* The __dict__ attribute, a dict or NULL */
        PyObject *func_weakreflist; /* List of weak references */
        PyObject *func_module;      /* The __module__ attribute, can be anything */
        PyObject *func_annotations; /* Annotations, a dict or NULL */
        PyObject *func_qualname;    /* The qualified name */
    } PyFunctionObject;
    
    • They reference a code object in their __code__ attribute, which is a purely syntactic object, i.e. nothing more than a compiled version of some source code lines.

    It then stores the a_func code object in the PyObject *func_code field. Now the message of the comment, "The function object and the code object are not the same", is clear.

    • There is one code object per source code "fragment", but each code object can be referenced by zero or many function objects depending only on how many times the 'def' statement in the source was executed so far.

    The part I don't understand is emphasized in bold.
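
The two phases above can be checked from Python itself. The following is a minimal sketch, assuming CPython and using the built-in compile() and exec() as stand-ins for what the interpreter does with Test.py; the names module_code, nested and namespace are only illustrative.

```python
import types

source = "def a_func():\n    pass\n"

# Compilation phase: only code objects exist, no function object yet.
module_code = compile(source, "Test.py", "exec")
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
a_func_code = nested[0]
print(a_func_code.co_name)   # a_func

# Execution phase: each execution of the 'def' statement builds
# a new function object around the same code object.
namespace = {}
exec(module_code, namespace)
first = namespace["a_func"]
exec(module_code, namespace)
second = namespace["a_func"]

print(first is second)                    # False: two function objects
print(first.__code__ is a_func_code)      # True
print(second.__code__ is a_func_code)     # True
```

Before the first exec() call the a_func code object is referenced by zero function objects; after the second call it has been referenced by two.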

If I create a lambda factory (which is a good idea for scoping reasons):

    def mk_const(k):
        def const(): return k
        return const

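then there is one code object for mk_const and one for const, but there are as many function objects for the latter as there have been calls to mk_const (possibly zero).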

(Using lambda makes no difference, but it is easier to explain with def.)
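
This can be observed directly. The following is a small sketch, assuming CPython, that calls the factory twice and compares the results; the variable names one, two and inner are only illustrative.

```python
import types

def mk_const(k):
    def const():
        return k
    return const

# The code object for const already exists inside mk_const's constants,
# even before mk_const has been called (zero function objects so far).
inner = [c for c in mk_const.__code__.co_consts if isinstance(c, types.CodeType)]
print(len(inner), inner[0].co_name)   # 1 const

one = mk_const(1)
two = mk_const(2)

print(one is two)                     # False: two distinct function objects
print(one.__code__ is two.__code__)   # True: both share one code object
print(one.__code__ is inner[0])       # True
print(one(), two())                   # 1 2
```

What differs between one and two is per-function state such as the closure, not the compiled code.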

It can also be the result of an if:

    if lib.version >= 4:
        def f(x): return lib.pretty(x)
    else:
        def f(x): return str(x)  # fallback

Here there are two code objects (plus one for the module), but at most one of them is used.
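
A rough way to see this, assuming CPython: compile a similar snippet and count the code objects among the module's constants. Since lib is not defined here, the sketch swaps in a hypothetical flag use_pretty and uses repr() in place of lib.pretty().

```python
import types

# use_pretty is a hypothetical stand-in for the lib.version check.
source = '''
if use_pretty:
    def f(x): return repr(x)
else:
    def f(x): return str(x)  # fallback
'''

module_code = compile(source, "<example>", "exec")
f_codes = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
print(len(f_codes))   # 2: one code object per 'def', whichever branch runs

namespace = {"use_pretty": False}
exec(module_code, namespace)
f = namespace["f"]

# Only the 'else' branch ran, so only its code object got a function object.
print(f.__code__ is f_codes[0])   # False
print(f.__code__ is f_codes[1])   # True
```

The code object from the branch that did not run stays in co_consts with no function object referencing it.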