Why does comparing bytes with str fail in Python 3?

Question

In Python 3, this expression evaluates to False:

b"" == ""

whereas in Python 2, this comparison is True:

u"" == ""

Checking identity with is obviously fails in both cases.

But why was this behavior implemented in Python 3?

Answer 1

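The designers decided not to assume an encoding for coercion when comparing bytes with strings, so the comparison falls under Python 3.x's default behavior for mismatched types: equality tests return False, and ordering comparisons raise TypeError.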
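A minimal sketch of that default behavior in Python 3. The helper name ordering_raises_type_error is hypothetical and exists only for this illustration; it is not part of any library.

```python
# Equality between bytes and str never coerces: the result is simply False.
print(b"" == "")        # False
print(b"abc" == "abc")  # False
print(b"abc" != "abc")  # True


def ordering_raises_type_error(left, right):
    """Return True if `left < right` raises TypeError (hypothetical helper)."""
    try:
        left < right
    except TypeError:
        return True
    return False


# Ordering has no sensible fallback, so Python raises instead of guessing.
print(ordering_raises_type_error(b"abc", "abc"))  # True
print(ordering_raises_type_error("abc", "abd"))   # False: same type, works fine
```

Starting the interpreter with the -b option makes bytes/str comparisons emit a BytesWarning, which can help track down accidental mixing.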

Answer 2

In Python 3, strings are Unicode. The type used to hold text is str, and the type used to hold data is bytes.

the str and bytes types cannot be mixed, you must always explicitly convert between them. Use str.encode() to go from str to bytes, and bytes.decode() to go from bytes to str.

So if you evaluate b"".decode() == "", you get True:

>>> b"".decode() == ""
True
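The same idea with non-empty, non-ASCII text, as a small sketch. The variable names and the choice of UTF-8 are assumptions made for this example; the point is that both sides must be the same type before comparing.

```python
text = "caf\u00e9"            # str: 'café', four Unicode code points
data = text.encode("utf-8")   # bytes: b'caf\xc3\xa9', five bytes

print(data == text)                  # False: different types, no coercion
print(data.decode("utf-8") == text)  # True: str compared with str
print(data == text.encode("utf-8"))  # True: bytes compared with bytes
print(len(text), len(data))          # 4 5
```

Whether to decode the bytes or encode the text depends on which side of the text/data boundary the rest of the code works on.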

For details, read Text Vs. Data Instead Of Unicode Vs. 8-bit.

Answer 3

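In Python 2.x, the design goal for Unicode was to enable transparent operations between Unicode and byte strings by implicitly converting between the two types.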

When you evaluate u"" == "", the str RHS is automatically decoded to Unicode first (using the default ASCII codec) and then compared with the unicode LHS. That is why it returns True.

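In contrast, Python 3.x learned from the Unicode mess in Python 2 and decided to make everything about Unicode versus byte strings explicit. Therefore b"" == "" is False, because the byte string is no longer automatically converted to Unicode for the comparison.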
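One way to see why an implicit conversion would be a guess: the same bytes decode to different text under different codecs. This is an illustrative sketch only; the byte values and variable names are chosen for the example.

```python
raw = b"\xc3\xa9"  # two bytes whose meaning depends on the encoding

as_utf8 = raw.decode("utf-8")      # 'é' (one code point, U+00E9)
as_latin1 = raw.decode("latin-1")  # 'Ã©' (two code points, U+00C3 U+00A9)

print(as_utf8 == "\u00e9")          # True
print(as_latin1 == "\u00e9")        # False
print(as_latin1 == "\u00c3\u00a9")  # True
print(raw == "\u00e9")              # False: Python 3 refuses to pick a codec
```

Because there is no single correct codec, Python 3 leaves the choice to the caller through an explicit decode() or encode() call.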

