How exactly does Python find `__new__` and choose its arguments?

Question

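While trying to implement some deep magic that I'd rather not get into here (I should be able to figure it out if I get an answer to this), it occurred to me that __new__ doesn't work the same way for classes that define it as for classes that don't. Specifically: when you define __new__ yourself, it is passed arguments that mirror those of __init__, but the default implementation doesn't accept any. This makes some sense, in that object is a builtin type and doesn't need those arguments for itself.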

However, it leads to the following behaviour, which I find quite vexing:

>>> class example:
...     def __init__(self, x): # a parameter other than `self` is necessary to reproduce
...         pass
>>> example(1) # no problem, we can create instances.
<__main__.example object at 0x...>
>>> example.__new__ # it does exist:
<built-in method __new__ of type object at 0x...>
>>> old_new = example.__new__ # let's store it for later, and try something evil:
>>> example.__new__ = 'broken'
>>> example(1) # Okay, of course that will break it...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable
>>> example.__new__ = old_new # but we CAN'T FIX IT AGAIN
>>> example(1) # the argument isn't accepted any more:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object.__new__() takes exactly one argument (the type to instantiate)
>>> example() # But we can't omit it either due to __init__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'x'

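Okay, but that's just because we still have something explicitly attached to example, so it's shadowing the default, which breaks some descriptor thingy... right? Except not: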

>>> del example.__new__ # if we get rid of it, the problem persists
>>> example(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object.__new__() takes exactly one argument (the type to instantiate)
>>> assert example.__new__ is old_new # even though the lookup gives us the same object!

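The same thing still happens if we just add and remove the attribute directly, without replacing it in between. Simply assigning and removing an attribute breaks the class, apparently irrevocably, and makes it impossible to instantiate. It's as if the class had some hidden attribute that tells it how to call __new__, and that attribute has been silently corrupted.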

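When we instantiate example at the start, how does Python actually find the base __new__ (it apparently finds object.__new__, but is it looking directly in object? Getting there indirectly via type? Something else?), and how does it decide that this __new__ should be called without arguments, even though it would pass an argument if we wrote a __new__ method inside the class? Why does that logic break if we temporarily mess with the class's __new__, even if we restore everything so that there is no observable net change?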

Answer 1

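The issues you're seeing aren't related to how Python finds __new__ or chooses its arguments. __new__ receives every argument you pass. The effects you observed come from specific code in object.__new__, combined with a bug in the logic for updating the C-level tp_new slot.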

There's nothing special about how Python passes arguments to __new__. What's special is what object.__new__ does with those arguments.
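To see this concretely, here is a minimal sketch that records what each method receives. The Recorder class and its calls list are hypothetical names introduced only for illustration.

```python
class Recorder:
    """Hypothetical class that logs the arguments each special method receives."""

    calls = []

    def __new__(cls, *args, **kwargs):
        cls.calls.append(("__new__", args, kwargs))
        # Forward only the class, so object.__new__ sees no extra arguments.
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        type(self).calls.append(("__init__", args, kwargs))


Recorder(1, 2, key="value")
for name, args, kwargs in Recorder.calls:
    print(name, args, kwargs)
```

Both entries carry the same positional and keyword arguments: the call machinery hands the full argument list to __new__ and then to __init__ without filtering anything.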

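object.__new__ and object.__init__ each expect one argument: the class to instantiate for __new__, and the object to initialize for __init__. If they receive any extra arguments, they either ignore them or throw an exception, depending on which methods have been overridden: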

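If a class overrides exactly one of __new__ or __init__, the non-overridden object method should ignore extra arguments, so people aren't forced to override both.
If a subclass __new__ or __init__ explicitly passes extra arguments to object.__new__ or object.__init__, the object method should raise an exception.
If neither __new__ nor __init__ is overridden, both object methods should throw an exception for extra arguments.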
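The three rules above can be checked from Python with a short sketch. All class names and the raises_type_error helper are hypothetical, and the results assume a CPython interpreter that behaves as described here.

```python
class OnlyInit:
    def __init__(self, x):
        self.x = x


class OnlyNew:
    def __new__(cls, x):
        return super().__new__(cls)


class Neither:
    pass


class PassesExtra:
    def __new__(cls, x):
        # Explicitly forwards the extra argument to object.__new__.
        return object.__new__(cls, x)


def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


results = {
    "only_init": raises_type_error(OnlyInit, 1),        # rule 1: extra argument ignored
    "only_new": raises_type_error(OnlyNew, 1),          # rule 1: extra argument ignored
    "passes_extra": raises_type_error(PassesExtra, 1),  # rule 2: exception
    "neither": raises_type_error(Neither, 1),           # rule 3: exception
}
print(results)
```

Only the last two cases raise. OnlyInit corresponds to the example class from the question before anything was assigned to its __new__.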

There's a big comment in the CPython source code discussing this.

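At the C level, __new__ and __init__ correspond to the tp_new and tp_init function pointer slots in a class's memory layout. Under normal circumstances, if one of these methods is implemented in C, the slot points directly to the C-level implementation, and a Python method object is generated that wraps the C function. If the method is implemented in Python, the slot instead points to a generic function (slot_tp_new in the case of __new__), which searches the MRO for the __new__ method object and calls it. When instantiating an object, Python invokes __new__ and __init__ by calling the tp_new and tp_init function pointers.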

object.__new__ is implemented by the C-level function object_new, and object.__init__ is implemented by object_init. The tp_new and tp_init slots of object are set to point to these functions.
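The slot pointers themselves are not visible from Python, but the method objects they correspond to are. The following sketch walks the MRO by hand to show where __new__ is found in each case. The find_in_mro helper and both classes are hypothetical, and the helper only approximates the attribute search, not the C slot machinery.

```python
import types


class Plain:
    def __init__(self, x):
        self.x = x


class WithNew:
    def __new__(cls):
        return super().__new__(cls)


def find_in_mro(cls, name):
    """Return (owner, raw attribute) for the first class in the MRO that defines name."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass, vars(klass)[name]
    raise AttributeError(name)


# A class without its own __new__ finds the C-implemented one on object.
owner, raw = find_in_mro(Plain, "__new__")
print(owner is object)
print(raw is object.__new__)
print(isinstance(raw, types.BuiltinFunctionType))

# A __new__ written in Python lives in the class dict, wrapped in a staticmethod.
owner2, raw2 = find_in_mro(WithNew, "__new__")
print(owner2 is WithNew)
print(isinstance(raw2, staticmethod))
```

Every line prints True. So the lookup for Plain reaches object directly through the MRO, and the method found there is a builtin wrapper around C code rather than an ordinary Python function.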

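object_new and object_init detect whether they have been overridden by checking the class's tp_new and tp_init slots. If tp_new points to something other than object_new, then __new__ has been overridden, and similarly for tp_init and __init__.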

static PyObject *
object_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    if (excess_args(args, kwds)) {
        if (type->tp_new != object_new) {
            PyErr_SetString(PyExc_TypeError,
                            "object.__new__() takes exactly one argument (the type to instantiate)");
            return NULL;
        }
        ...

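Now, when you assign or delete __new__, Python has to update the tp_new slot to reflect this. When you assign __new__ on a class, Python sets the class's tp_new slot to the generic slot_tp_new function, which searches for a __new__ method and calls it. When you delete __new__, the class is supposed to re-inherit tp_new from the superclass, but the code has a bug in it: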

else if (Py_TYPE(descr) == &PyCFunction_Type &&
         PyCFunction_GET_FUNCTION(descr) ==
         (PyCFunction)(void(*)(void))tp_new_wrapper &&
         ptr == (void**)&type->tp_new)
{
    /* The __new__ wrapper is not a wrapper descriptor,
       so must be special-cased differently.
       If we don't do this, creating an instance will
       always use slot_tp_new which will look up
       __new__ in the MRO which will call tp_new_wrapper
       which will look through the base classes looking
       for a static base and call its tp_new (usually
       PyType_GenericNew), after performing various
       sanity checks and constructing a new argument
       list.  Cut all that nonsense short -- this speeds
       up instance creation tremendously. */
    specific = (void *)type->tp_new;
    /* XXX I'm not 100% sure that there isn't a hole
       in this reasoning that requires additional
       sanity checks.  I'll buy the first person to
       point out a bug in this reasoning a beer. */
}

In the specific = (void *)type->tp_new; line, type is the wrong type - it's the class whose slot we're trying to update, not the class we're supposed to inherit tp_new from.

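When this code finds a __new__ method written in C, instead of updating tp_new to point to the corresponding C function, it sets tp_new to whatever value it already had! It doesn't change tp_new at all!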
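A toy model in Python may make the no-op easier to see. Everything here is hypothetical: ToyType, find_new, update_tp_new and the string slot values only mimic the shape of the C logic, and the non-buggy branch is an assumption about the intended behaviour, not an actual CPython fix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyType:
    """Hypothetical stand-in for a type object with a single slot."""
    name: str
    tp_new: str                       # "object_new" or "slot_tp_new"
    own_new: Optional[str] = None     # "c", "python", or None if not in the class dict
    base: Optional["ToyType"] = None


def find_new(t):
    """Walk the toy MRO and return (owner, kind) of the first __new__ found."""
    while t is not None:
        if t.own_new is not None:
            return t, t.own_new
        t = t.base
    raise LookupError("no __new__ found")


def update_tp_new(t, buggy):
    """Recompute t.tp_new after __new__ was assigned or deleted."""
    owner, kind = find_new(t)
    if kind == "python":
        specific = "slot_tp_new"      # generic function that searches the MRO
    elif buggy:
        specific = t.tp_new           # mirrors the C line: reads the slot being updated
    else:
        specific = owner.tp_new       # assumed intent: inherit from the defining class
    t.tp_new = specific


def assign_then_delete(buggy):
    obj = ToyType("object", tp_new="object_new", own_new="c")
    example = ToyType("example", tp_new="object_new", base=obj)
    example.own_new = "python"        # example.__new__ = <something defined in Python>
    update_tp_new(example, buggy)
    example.own_new = None            # del example.__new__
    update_tp_new(example, buggy)
    return example.tp_new


print(assign_then_delete(buggy=True))
print(assign_then_delete(buggy=False))
```

With buggy=True the toy slot stays at "slot_tp_new" after the deletion, while the assumed intended logic restores "object_new".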

So initially, your example class has tp_new set to object_new, and object_new ignores extra arguments because it sees that __init__ is overridden and __new__ isn't.

When you set example.__new__ = 'broken', Python sets example's tp_new to slot_tp_new. Nothing you do after that point changes tp_new to anything else, even though del example.__new__ really should.

When object_new finds that example's tp_new is slot_tp_new instead of object_new, it rejects the extra arguments and throws an exception.
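The whole sequence can be replayed as a script rather than an interactive session. This is a sketch: can_instantiate is a hypothetical helper, and the final result depends on whether the interpreter in use still has the slot-update bug, so it is printed rather than assumed.

```python
class Example:
    def __init__(self, x):
        self.x = x


def can_instantiate(cls, *args):
    """Return True if cls(*args) succeeds, False if it raises TypeError."""
    try:
        cls(*args)
    except TypeError:
        return False
    return True


before = can_instantiate(Example, 1)

# Assign a __new__ written in Python, then remove it again.
Example.__new__ = lambda cls, *args, **kwargs: object.__new__(cls)
del Example.__new__

# From Python, the class looks exactly as it did at the start.
no_own_new = "__new__" not in vars(Example)
same_lookup = Example.__new__ is object.__new__

# On an interpreter with the bug described above, this is False.
after = can_instantiate(Example, 1)

print(before, no_own_new, same_lookup, after)
```

The first three values are True on any interpreter matching the question's session. If the last one is False, the class dict and the attribute lookup are back to normal but the hidden tp_new slot is not.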

The bug manifests in a few other ways too. For example,

>>> class Example: pass
... 
>>> Example.__new__ = tuple.__new__
>>> Example()
<__main__.Example object at 0x7f9d0a38f400>

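Before the __new__ assignment, Example has tp_new set to object_new. When the example does Example.__new__ = tuple.__new__, Python finds that tuple.__new__ is implemented in C, so it fails to update tp_new and leaves it set to object_new. Then, in Example(), tuple.__new__ should raise an exception, because tuple.__new__ isn't applicable to Example: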

>>> tuple.__new__(Example)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: tuple.__new__(Example): Example is not a subtype of tuple

But because tp_new is still set to object_new, object_new gets called instead of tuple.__new__.

The developers have tried to fix the buggy code several times, but each fix was itself buggy and got reverted. The second attempt got closer, but broke multiple inheritance - see the conversation in the bug tracker.
