bytes.decode() in Python2 and Python3

In the SQLAlchemy source code, I see the following:

    val = cursor.fetchone()[0]
    if util.py3k and isinstance(val, bytes):
        val = val.decode()

Why do we decode only for Python 3 and not for Python 2?

You can read the details in the documentation of string encoding frustration here.

In short, because SQLAlchemy contains legacy APIs that return data as bytes, the statement above is a simple way to convert string data held as bytes into Unicode on Python 3.
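As a rough illustration of that pattern, here is a minimal sketch. The names `PY3K` and `to_native_str` are hypothetical stand-ins (the snippet in the question uses `util.py3k`), and the explicit `encoding` parameter is an assumption added for clarity; the original calls `decode()` with its default.

```python
import sys

# Hypothetical stand-in for the util.py3k flag seen in the question.
PY3K = sys.version_info[0] >= 3


def to_native_str(val, encoding="utf-8"):
    """Return val as the interpreter's native str type.

    On Python 3, bytes are decoded to str. On Python 2, bytes is
    already str, so the value is returned unchanged.
    """
    if PY3K and isinstance(val, bytes):
        return val.decode(encoding)
    return val


decoded = to_native_str(b"5.7.30")   # bytes in, native str out
unchanged = to_native_str("5.7.30")  # already a str, passed through
```

Either way the caller ends up with a native `str`, which is all the guarded `decode()` in the question is trying to guarantee.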

In Python 3, "normal" strings are Unicode (as opposed to Python 2, where they are byte strings: (Extended) ASCII (or ANSI)). According to [Python 3.Docs]: Unicode HOWTO - The String Type:

Since Python 3.0, the language’s str type contains Unicode characters, meaning any string created using "unicode rocks!", 'unicode rocks!', or the triple-quoted string syntax is stored as Unicode.

Example:

  • Python 3:

    >>> import sys
    >>> sys.version
    '3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)]'
    >>>
    >>> b = b"abcd"
    >>> s = "abcd"
    >>> u = u"abcd"
    >>>
    >>> type(b), type(s), type(u)
    (<class 'bytes'>, <class 'str'>, <class 'str'>)
    >>>
    >>> b.decode()
    'abcd'
    >>> s.decode()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'str' object has no attribute 'decode'
    >>> u.decode()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'str' object has no attribute 'decode'
    
  • Python 2:

    >>> import sys
    >>> sys.version
    '2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)]'
    >>>
    >>> b = b"abcd"
    >>> s = "abcd"
    >>> u = u"abcd"
    >>>
    >>> type(b), type(s), type(u)
    (<type 'str'>, <type 'str'>, <type 'unicode'>)
    >>>
    >>> b.decode()
    u'abcd'
    >>> s.decode()
    u'abcd'
    >>> u.decode()
    u'abcd'
    

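val is then passed on (to _parse_server_version) as a str. Because bytes and str are different types in Python 3, the conversion is needed there. In Python 2, bytes is just an alias for str (as the example above shows), so no conversion is required.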
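To see why the distinction matters on Python 3, consider a minimal sketch of a string-based version parser. `parse_version` is a hypothetical stand-in, not SQLAlchemy's actual `_parse_server_version`; it only assumes that the parser applies `str` operations to its argument.

```python
import re


def parse_version(text):
    # Hypothetical parser: applies a str regex pattern to its input.
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups())


raw = b"5.7.30"  # what a driver might hand back as bytes

try:
    parse_version(raw)
    bytes_ok = True
except TypeError:
    # Python 3 refuses to apply a str pattern to a bytes object.
    bytes_ok = False

# Decoding first gives the parser the str it expects.
version = parse_version(raw.decode())
```

On Python 3 the undecoded call raises TypeError, while the decoded call parses normally. On Python 2 both calls would succeed, because `raw` is already a `str` there.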