Why is bytearray not a Sequence in Python 2?

I've found a strange behavioral difference between Python 2 and 3.

In Python 3, things seem to work fine:

Python 3.5.0rc2 (v3.5.0rc2:cc15d736d860, Aug 25 2015, 04:45:41) [MSC v.1900 32 bit (Intel)] on win32
>>> from collections import Sequence
>>> isinstance(bytearray(b"56"), Sequence)
True

But not in Python 2:

Python 2.7.10 (default, May 23 2015, 09:44:00) [MSC v.1500 64 bit (AMD64)] on win32
>>> from collections import Sequence
>>> isinstance(bytearray("56"), Sequence)
False

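The results seem to be consistent across minor versions of both Python 2.x and 3.x. Is this a known bug? Is it a bug at all? Is there any logic behind this difference?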

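I am actually more concerned with whether the C API function PySequence_Check properly identifies an object of type PyByteArray_Type as exposing the sequence protocol. From looking at the source it seems like it should, but any insight into this whole thing is very welcome.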

The abstract classes in collections use ABCMeta.register(subclass) to:

Register subclass as a “virtual subclass” of this ABC.
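
To make the idea of a "virtual subclass" concrete, here is a minimal sketch. The names Drawable and Circle are hypothetical and exist only for this illustration; the ABC is created by calling ABCMeta directly so the snippet reads the same on Python 2 and Python 3.

```python
from abc import ABCMeta

# Hypothetical ABC, built by calling the metaclass so that the same
# code works on both Python 2 and Python 3.
Drawable = ABCMeta("Drawable", (object,), {})


class Circle(object):
    """Hypothetical class that does NOT inherit from Drawable."""


before = issubclass(Circle, Drawable)  # no relationship yet
Drawable.register(Circle)              # declare Circle a virtual subclass
after = issubclass(Circle, Drawable)   # now recognized by the ABC

# Registration does not touch the real inheritance chain.
in_mro = Drawable in Circle.__mro__

print(before, after, in_mro)
```

Registration only affects isinstance and issubclass checks against the ABC; Circle gains no methods and its MRO is unchanged.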

In Python 3, issubclass(bytearray, Sequence) returns True because bytearray is explicitly registered as a subclass of ByteString (which is derived from Sequence) and of MutableSequence. See the relevant part of Lib/_collections_abc.py:

class ByteString(Sequence):

    """This unifies bytes and bytearray.

    XXX Should add all their methods.
    """

    __slots__ = ()

ByteString.register(bytes)
ByteString.register(bytearray)
...
MutableSequence.register(bytearray)  # Multiply inheriting, see ByteString

Python 2 doesn't do that (from Lib/_abcoll.py):

Sequence.register(tuple)
Sequence.register(basestring)
Sequence.register(buffer)
Sequence.register(xrange)
...
MutableSequence.register(list)
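
If Python 2 code needs the Python 3 behaviour, one possible workaround is to add the missing registration yourself. This is only a sketch of that idea, not something the standard library does for you. Note that it changes the ABC for the whole process, so it is best done once at start-up.

```python
try:
    # Python 3.3+ location of the ABCs
    from collections.abc import MutableSequence, Sequence
except ImportError:
    # Python 2 location
    from collections import MutableSequence, Sequence

# On Python 2 this adds the registration that _abcoll.py lacks.
# On Python 3 bytearray is already registered, so the call is harmless.
MutableSequence.register(bytearray)

data = bytearray(b"56")
is_sequence = isinstance(data, Sequence)
is_mutable_sequence = isinstance(data, MutableSequence)

print(is_sequence, is_mutable_sequence)
```

Because MutableSequence derives from Sequence, registering with MutableSequence is enough for the Sequence check to succeed as well.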

This behavior was changed in Python 3.0 (in this commit specifically):

Add ABC ByteString which unifies bytes and bytearray (but not memoryview). There's no ABC for "PEP 3118 style buffer API objects" because there's no way to recognize these in Python (apart from trying to use memoryview() on them).

There is more information in PEP 3119:

This is a proposal to add Abstract Base Class (ABC) support to Python 3000. It proposes: [...] Specific ABCs for containers and iterators, to be added to the collections module.

Much of the thinking that went into the proposal is not about the specific mechanism of ABCs, as contrasted with Interfaces or Generic Functions (GFs), but about clarifying philosophical issues like "what makes a set", "what makes a mapping" and "what makes a sequence".

[...] a metaclass for use with ABCs that will allow us to add an ABC as a "virtual base class" (not the same concept as in C++) to any class, including to another ABC. This allows the standard library to define ABCs Sequence and MutableSequence and register these as virtual base classes for built-in types like basestring, tuple and list, so that for example the following conditions are all true: [...] issubclass(bytearray, MutableSequence).

FYI, memoryview was registered as a subclass of Sequence only in Python 3.4:

There's no ducktyping for this due to the Sequence/Mapping confusion so it's a simple missing explicit registration.

(see issue18690 for details).


PySequence_Check from the Python C API does not rely on the collections module:

int
PySequence_Check(PyObject *s)
{
    if (PyDict_Check(s))
        return 0;
    return s != NULL && s->ob_type->tp_as_sequence &&
        s->ob_type->tp_as_sequence->sq_item != NULL;
}

It checks for a non-zero tp_as_sequence field (example for bytearray) and, if that succeeds, for a non-zero sq_item field (which is basically getitem - example for bytearray).
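
If you want to confirm this without writing an extension module, the check can be called from Python through ctypes. This is a rough sketch that assumes CPython, since ctypes.pythonapi is CPython-specific; the helper name has_sequence_protocol is made up for this example.

```python
import ctypes

# Bind the C API function. py_object passes the Python object pointer
# through unchanged. Assumes CPython.
_PySequence_Check = ctypes.pythonapi.PySequence_Check
_PySequence_Check.argtypes = [ctypes.py_object]
_PySequence_Check.restype = ctypes.c_int


def has_sequence_protocol(obj):
    """Hypothetical helper: True if PySequence_Check(obj) returns non-zero."""
    return bool(_PySequence_Check(obj))


bytearray_ok = has_sequence_protocol(bytearray(b"56"))  # has sq_item
dict_ok = has_sequence_protocol({})                     # dicts are excluded
int_ok = has_sequence_protocol(42)                      # no tp_as_sequence

print(bytearray_ok, dict_ok, int_ok)
```

The result for bytearray depends only on the type's C-level slots, not on any ABC registration, which is why the collections difference between Python 2 and 3 does not affect it.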

When you look at the source code of the collections abstract classes, you will see that in Python 3 (file _collections_abc.py) a subclass of the Sequence class, the ByteString class, registers bytearray, while in Python 2 (file _abcoll.py) there is no ByteString class and Sequence itself does not register bytearray.

By "register" I mean that the abstract class Sequence (or its subclass ByteString) calls the abc.ABCMeta.register method, which, as its description says, will: Register subclass as a "virtual subclass" of this ABC.

I think this is what causes the different behavior between py2 and py3, but IMHO this is a bug (or rather, a bug that was fixed in py3).