Bytes to string conversion in Python doesn't seem to work as expected

Question

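Why is it that in Python 3 the following code

print(str(b"Hello"))

outputs b'Hello' instead of just Hello, as it would for a regular text string? It seems counterintuitive that creating a str object from the most closely related binary string type is not more explicit and straightforward.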

Answer 1

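In Python 3, bytes does not provide a __str__ that decodes the data, so when you call str() on a bytes object you get the same result as bytes.__repr__. Note that print() also calls str() on the objects passed in, so the explicit str() call here is entirely redundant.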
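A minimal sketch of that fallback, using only built-ins. The variable names are illustrative, and the output is captured in a buffer only so it can be compared:

```python
import io
from contextlib import redirect_stdout

data = b"Hello"

# Without an encoding, str() gives the same text as repr().
as_str = str(data)
as_repr = repr(data)

# print() calls str() on its argument itself, so print(data) and
# print(str(data)) write the same line.
buf = io.StringIO()
with redirect_stdout(buf):
    print(data)
    print(str(data))
printed = buf.getvalue().splitlines()
```

Both captured lines should read b'Hello', quotes and prefix included.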

If you want text, decode explicitly:

print(b'Hello'.decode('ascii'))

The str() type can handle bytes objects explicitly, but only if (again) you provide an explicit codec to decode the bytes with first:

print(str(b'Hello', 'ascii'))

The documentation is very explicit about this behaviour:

If neither encoding nor errors is given, str(object) returns object.__str__(), which is the “informal” or nicely printable string representation of object. For string objects, this is the string itself. If object does not have a __str__() method, then str() falls back to returning repr(object).

If at least one of encoding or errors is given, object should be a bytes-like object (e.g. bytes or bytearray). In this case, if object is a bytes (or bytearray) object, then str(bytes, encoding, errors) is equivalent to bytes.decode(encoding, errors).

and

Passing a bytes object to str() without the encoding or errors arguments falls under the first case of returning the informal string representation.

Emphasis mine.
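The two cases described in the quoted documentation can be sketched as follows. This is only an illustration with made-up variable names; it relies on str() defaulting to UTF-8 when only errors is given:

```python
data = b"Hello"

# Case 1: neither encoding nor errors given -> informal representation,
# which for bytes is the same text as repr().
informal = str(data)

# Case 2: encoding and/or errors given -> behaves like bytes.decode().
decoded_with_encoding = str(data, "ascii")
decoded_with_errors_only = str(data, errors="strict")  # encoding defaults to UTF-8
decoded_bytearray = str(bytearray(data), "ascii")      # bytearray works the same way
```

Only the first call keeps the b'...' wrapper; the other three all yield the plain text Hello.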

Answer 2

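Why should it "work"? A bytes object is a bytes object, and this is its string representation in Python 3. To turn its contents into a proper text string (in Python 3; in Python 2 that would be a "unicode" object) you have to decode it to text.

And for that you need to know the encoding.

Try this instead: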

print(b"Hello".decode("latin-1"))

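Note the assumed "latin-1" text codec, which transparently maps every byte outside the ASCII range (128-255) to the Unicode code point with the same number, so decoding never fails. It is close to, though not identical with, cp1252, the code page Windows uses by default for Western European languages.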

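The "utf-8" codec can represent a much larger range of characters and is the preferred encoding for international text, but if your byte string is not valid UTF-8 you may get a UnicodeDecodeError in the process.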
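To make the difference between the two codecs concrete, here is a small sketch. The sample bytes are an assumption chosen for illustration: the word "café" encoded as latin-1, which is not valid UTF-8:

```python
raw = b"caf\xe9"  # "café" as latin-1 bytes; 0xE9 alone is not valid UTF-8

# latin-1 maps each byte 0-255 to the code point of the same value,
# so this decode cannot fail (but it may be the wrong guess for other data).
latin1_text = raw.decode("latin-1")

# UTF-8 is strict about byte sequences and rejects this input.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Bytes that really were produced by a UTF-8 encoder decode without trouble.
utf8_text = "caf\u00e9".encode("utf-8").decode("utf-8")
```

latin-1 never raising is convenient, but it also means a wrong guess goes unnoticed, whereas UTF-8 tends to fail loudly on data in another encoding.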

Please read http://www.joelonsoftware.com/articles/Unicode.html to get a proper understanding of what text is about.

Answer 3

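Sorry in advance for my English...

Hey, I ran into this problem a few weeks ago. It works as the people above said. Here is a tip for when exceptions from the decoding process don't matter. In that case you can use: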

bytesText.decode(textEncoding, 'ignore')

For example:

>>> b'text \xab text'.decode('utf-8', 'ignore')  # Using UTF-8 is nice as you might know!
'text  text'                                     # As you can see, the « (\xab) symbol was
                                                 # ignored :D
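'ignore' is one of several standard error handlers. As a rough comparison on the same input (the variable names are illustrative, and 'backslashreplace' for decoding assumes Python 3.5 or newer):

```python
raw = b"text \xab text"

# Drop the undecodable byte silently.
ignored = raw.decode("utf-8", "ignore")

# Substitute U+FFFD (the replacement character) so the data loss stays visible.
replaced = raw.decode("utf-8", "replace")

# Keep the byte value as a literal backslash escape sequence in the text.
escaped = raw.decode("utf-8", "backslashreplace")
```

'ignore' loses data without a trace, so 'replace' or 'backslashreplace' may be the safer choice when you still want to notice that something was wrong with the input.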
