How to write exact wordings of unicode characters into a file?

Question

When I try to write the exact text "සිසුන්ට සිවු" to a JSON file using Python 3.6, \u0dc3\u0dd2\u0dc3\u0dd4\u0db1\u0dca\u0da7 \u0dc3\u0dd2\u0dc0\u0dd4 is written to the JSON file instead.

I read the Excel file with xlrd and write the output with open().

import xlrd 
import json

wb = xlrd.open_workbook('data.xlsx',encoding_override='utf-8') 
sheet = wb.sheet_by_index(0) 

with open('data.json', 'w') as outfile:
    data = json.dump(outerdata,outfile,ensure_ascii=True)

Answer 1

If I print the escaped string you reported in Python:

>>> print ("\u0dc3\u0dd2\u0dc3\u0dd4\u0db1\u0dca\u0da7 \u0dc3\u0dd2\u0dc0\u0dd4")
සිසුන්ට සිවු

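As you can see, the escapes render as exactly the characters you want. These are two different representations of the same data, and both are valid JSON. However, you are calling json.dump() with ensure_ascii=True, which tells json.dump() that you want the escaped representation. That is what "ascii" means here: the output contains only ASCII characters (the printable ones run from chr(32) to chr(126)), so every non-ASCII character is written as a \uXXXX escape. Change it to ensure_ascii=False.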
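As a minimal sketch of that point, assuming a small hypothetical dictionary `sample` in place of your `outerdata`, both settings of ensure_ascii produce JSON that decodes to the same data:

```python
import json

# Hypothetical stand-in for the asker's data; not from the original post.
sample = {"text": "සිසුන්ට සිවු"}

# ensure_ascii=True (the default): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(sample, ensure_ascii=True)
# ensure_ascii=False: non-ASCII characters are emitted as they are.
literal = json.dumps(sample, ensure_ascii=False)

print(escaped)
print(literal)

# Both strings decode back to the same Python object.
same = json.loads(escaped) == json.loads(literal) == sample
print(same)
```

Only the text of the file differs between the two; any JSON parser reads back identical data from either one.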

But because you are then no longer writing pure ASCII to the output file data.json, you need to specify an encoding when you open it:

with open("data.json", "w", encoding="utf-8") as outfile:
    # json.dump() writes to the file and returns None, so there is nothing to assign
    json.dump(outerdata, outfile, ensure_ascii=False)

This will make your JSON file look the way you want.
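To check the result, here is a minimal self-contained sketch. The helper names `write_json_utf8` and `read_text_utf8` and the `sample` data are hypothetical, and a temporary directory stands in for your working directory. It writes the file and then reads the raw text back:

```python
import json
import os
import tempfile


def write_json_utf8(obj, path):
    """Write obj as JSON, keeping non-ASCII characters literal in the file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f, ensure_ascii=False)


def read_text_utf8(path):
    """Return the raw text of a UTF-8 encoded file."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Hypothetical stand-in for the data read from the spreadsheet.
sample = {"text": "සිසුන්ට සිවු"}

path = os.path.join(tempfile.mkdtemp(), "data.json")
write_json_utf8(sample, path)

raw = read_text_utf8(path)
print(raw)  # the Sinhala text appears literally, with no \u escapes
```

Passing encoding="utf-8" explicitly matters because the default encoding of open() depends on the platform, and a non-UTF-8 default may fail to encode these characters.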


python

xlrd