Decode other languages

Question

In Python, I set a string containing text in another language like this:

import json

name = "அரவிந்த்"

result = {"Name": name}
j_res = json.dumps(result)
print j_res

Output:

{"Name": "\u0b85\u0bb0\u0bb5\u0bbf\u0ba8\u0bcd\u0ba4\u0bcd"}

Is there any way to get the name அரவிந்த் back from the text \u0b85\u0bb0\u0bb5\u0bbf\u0ba8\u0bcd\u0ba4\u0bcd?

Answer 1

Yes, it's as simple as this:

# -*- coding: utf-8 -*-

import json

name = "அரவிந்த்"

result = {"Name": name}
j_res = json.dumps(result)

print j_res
print json.loads(j_res)
print json.loads(j_res)["Name"]

Output:

{"Name": "\u0b85\u0bb0\u0bb5\u0bbf\u0ba8\u0bcd\u0ba4\u0bcd"}
{u'Name': u'\u0b85\u0bb0\u0bb5\u0bbf\u0ba8\u0bcd\u0ba4\u0bcd'}
அரவிந்த்
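
The \uXXXX sequences are only the ASCII-safe escaping that JSON uses; the decoded value is the same string. If the JSON text itself should stay readable, json.dumps accepts ensure_ascii=False. A minimal sketch, assuming Python 3 (under Python 2.7 the literal would need to be a u"..." literal and the file would need the coding declaration shown above):

```python
import json

name = "அரவிந்த்"

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps({"Name": name})

# ensure_ascii=False leaves the characters as they are in the JSON text.
readable = json.dumps({"Name": name}, ensure_ascii=False)

# Both forms decode back to the same original string.
decoded_from_escaped = json.loads(escaped)["Name"]
decoded_from_readable = json.loads(readable)["Name"]

print(escaped)
print(readable)
print(decoded_from_escaped == decoded_from_readable == name)
```

Either form is valid JSON, so the choice only affects how the serialized text looks, not what a consumer gets after json.loads.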

Answer 2

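In Python 2.7, a str is just a sequence of bytes (each with a value from 0 through 255), and only the ASCII range (0 through 127) is handled without an explicit encoding. If you need to handle and display characters beyond that range, you should almost certainly use unicode objects (literals prefixed with u) rather than plain str (the default type for string literals).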

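In Python 3, this problem is largely solved: str is a sequence of Unicode code points, raw bytes live in a separate bytes type, and an encoding (usually UTF-8) is applied only when converting between the two. If you can use Python 3, it will probably solve this problem and many similar ones concerning how strings and characters are stored and displayed.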
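
The difference between text and bytes in Python 3 can be illustrated with a short sketch. It assumes Python 3 and UTF-8 as the encoding; the name is written with the escapes from the question so the code point count is unambiguous.

```python
# The same name as in the question, written with \u escapes.
name = "\u0b85\u0bb0\u0bb5\u0bbf\u0ba8\u0bcd\u0ba4\u0bcd"

# str holds Unicode code points; this name consists of 8 of them.
code_points = len(name)

# Encoding turns text into bytes; each of these code points needs 3 bytes in UTF-8.
raw = name.encode("utf-8")
byte_count = len(raw)

# Decoding with the same encoding restores the original text.
round_trip = raw.decode("utf-8")

print(code_points, byte_count, round_trip == name)
```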

If you are forced to use Python 2.7, you should read these values with an explicit encoding and make certain they are loaded as unicode objects.
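
One way to read text with an explicit encoding is io.open, which exists in both Python 2.7 and Python 3 and decodes while reading. The following is only a sketch: the temporary file is a hypothetical stand-in for wherever the data really comes from, and UTF-8 is an assumed encoding.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# u"..." makes this a unicode object on Python 2.7; it is a plain str on Python 3.
name = u"\u0b85\u0bb0\u0bb5\u0bbf\u0ba8\u0bcd\u0ba4\u0bcd"

# Hypothetical sample file, created here only so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(name)
    # Passing encoding= means read() returns decoded text, not raw bytes.
    with io.open(path, "r", encoding="utf-8") as f:
        loaded = f.read()
finally:
    os.remove(path)

print(loaded == name)
```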

unicode

python-2.7