How to decode a unicode string in Python

Question

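What is the best way to decode an encoded string that looks like this: u'u\xf1somestring' ?

Background: I have a list that contains random values (strings and integers). I am trying to convert every item in the list to a string and then process each one.

It turns out that some of the items have the format u'u\xf1somestring'. When I try to convert one of them to a string, I get the error: UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 1: ordinal not in range(128)

I have tried: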

item = u'u\xf1somestring'
decoded_value = item.decode('utf-8', 'ignore')

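However, I keep getting the same error.

I have read up on unicode characters and tried a number of suggestions from SO, but none of them has worked so far. Am I missing something here?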

Answer 1

You need to call encode rather than decode, because item is already decoded (it is a unicode object, not a byte string).

Like this:

encoded_value = item.encode('utf-8')
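
To make the difference concrete, here is a minimal sketch using the string from the question. In Python 2, calling decode on a unicode object first encodes it implicitly with the ASCII codec, which is where the UnicodeEncodeError comes from; the try block below reproduces that step explicitly. The names encoded_value and ascii_failed are only illustrative.

```python
# -*- coding: utf-8 -*-
item = u'u\xf1somestring'   # already text (a unicode object in Python 2)

# Text -> bytes: this is the direction that is needed here.
encoded_value = item.encode('utf-8')
print(repr(encoded_value))

# Decoding those bytes gives the original text back.
round_trip = encoded_value.decode('utf-8')

# The ASCII codec cannot represent u'\xf1', so this step fails. Python 2 runs
# the same step implicitly when decode() is called on a unicode object.
try:
    item.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

Passing 'ignore' to decode does not help in Python 2, because that error handler applies to the decoding step, not to the implicit ASCII encoding that fails before it.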

Answer 2

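That string is already decoded (it is a Unicode object). You need to encode it if you want to store it in a file (or send it to a dumb terminal, etc.).

Generally, when working with Unicode in Python 2, you should decode all your strings early in the workflow (which you already seem to have done; many libraries that handle internet traffic will do that for you), then do all your work on Unicode objects, and only at the very end, when writing them back out, encode them to whatever encoding you are using.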
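
A rough sketch of that workflow, applied to the mixed list from the question. The helper to_text, the sample items, and the output file name are assumptions made up for illustration; the sketch also assumes any byte strings in the list are UTF-8.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

try:
    text_type = unicode   # Python 2
except NameError:
    text_type = str       # Python 3

def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return value as text, decoding byte strings."""
    if isinstance(value, bytes):
        return value.decode(encoding)   # decode early
    return text_type(value)             # ints, unicode objects, etc.

items = [u'u\xf1somestring', 42, b'plain']   # made-up sample data

# 1. Decode (or convert) everything to text as early as possible.
texts = [to_text(i) for i in items]

# 2. Do all the processing on text objects.
processed = [t.upper() for t in texts]

# 3. Encode only at the boundary; io.open encodes on write.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(u'\n'.join(processed))

# Reading the file in binary mode shows the UTF-8 bytes that were written.
with io.open(path, 'rb') as f:
    raw = f.read()
```

Using to_text instead of str(item) avoids the implicit ASCII encoding that str() performs on unicode objects in Python 2.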

Tags: string, unicode, encode, decode, python-2.7