UnicodeEncodeError: 'ascii' codec can't encode character despite trying other SO solutions


I am trying to convert a CSV file to a JSON file. Partway through writing the JSON file, I get a Unicode error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u06ec' in position 933: ordinal not in range(128)

My code:

import csv
import json
import codecs


csvfile = codecs.open('my.csv', 'r', encoding='utf-8', errors='ignore')
jsonfile = codecs.open('my.json',"w", encoding='utf-8',errors='ignore')

fieldnames = ("Title","Date","Text","Country","Page","Week")
reader = csv.DictReader(csvfile, fieldnames)
for row in reader:
    row['Text'] = row['Text'].encode('ascii',errors='ignore') # the error occurs on this line

    json.dump(row, jsonfile)
    jsonfile.write('\n')

A sample row:

{'Country': 'UK', 'Title': '12345', 'Text': "  hi there  hi john i currently ", 'Week': 'week2', 'Page': 'homepage', 'Date': '1/3/16'}

Don't convert to ASCII.

JSON handles Unicode natively. Just remove the .encode("ascii", ...) part.

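Also, you don't need to set an encoding on the file object you write the JSON to: by default the json module escapes non-ASCII characters as \uXXXX sequences, so its output is plain ASCII.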
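As a rough illustration of the advice above, here is a minimal sketch. It assumes Python 3 (the question's code is Python 2), and the helper name `csv_to_json_lines` and the in-memory sample row are hypothetical, made up for the demo. The row is passed to `json.dump` without any `.encode('ascii', ...)` step.

```python
import csv
import io
import json

FIELDNAMES = ("Title", "Date", "Text", "Country", "Page", "Week")


def csv_to_json_lines(csv_text_file, json_file, fieldnames=FIELDNAMES):
    """Hypothetical helper: write one JSON object per CSV row, return the row count."""
    reader = csv.DictReader(csv_text_file, fieldnames=fieldnames)
    count = 0
    for row in reader:
        # No .encode('ascii') here: with the default ensure_ascii=True,
        # json.dump writes non-ASCII characters as \uXXXX escapes.
        json.dump(row, json_file)
        json_file.write("\n")
        count += 1
    return count


# In-memory demo with a made-up row containing U+06EC, the character from the traceback.
sample = io.StringIO("12345,1/3/16,hi there \u06ec john,UK,homepage,week2\n")
out = io.StringIO()
rows_written = csv_to_json_lines(sample, out)
result_line = out.getvalue().splitlines()[0]
```

With real files, opening the input as `open('my.csv', newline='', encoding='utf-8')` and the output as `open('my.json', 'w')` should be enough, assuming the CSV really is UTF-8.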

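I edited my code to read the CSV file in binary mode. That gave me a different problem, an invalid byte error, which I solved by converting the text string to unicode.

Here is the working code: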

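import csv
import json
import codecs

csvfile = open('my.csv', 'rb')  # Python 2: the csv module expects byte strings
jsonfile = codecs.open('my.json',"w")

fieldnames = ("Title","Date","Text","Country","Page","Week")
reader = csv.DictReader(csvfile, fieldnames)
for row in reader:
    print row
    # Decode the byte string; bytes that cannot be decoded become U+FFFD
    row['Text'] = unicode(row['Text'],errors='replace')
    # Write step carried over from the loop in the question
    json.dump(row, jsonfile)
    jsonfile.write('\n')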
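One caveat about the `errors='replace'` approach: in Python 2, `unicode(s, errors='replace')` decodes with the default codec, which is normally ASCII, so every non-ASCII byte is turned into U+FFFD instead of being preserved. The sketch below shows the difference using Python 3's `bytes.decode`. It assumes the file is UTF-8 (as the `codecs.open` call in the question suggests), and the sample bytes are made up.

```python
# UTF-8 bytes for "hi " followed by U+06EC, the character from the traceback
raw = "hi \u06ec".encode("utf-8")

# Decoding with the right codec keeps the character intact
kept = raw.decode("utf-8")

# Decoding as ASCII with errors='replace' swaps each non-ASCII byte for U+FFFD
lossy = raw.decode("ascii", errors="replace")

# A byte that is never valid in UTF-8, to show replace vs. ignore
broken = b"hi \xff john"
replaced = broken.decode("utf-8", errors="replace")
ignored = broken.decode("utf-8", errors="ignore")
```

If the goal is to keep the original text, decoding with the file's actual encoding and reserving `errors='replace'` for genuinely corrupt bytes is probably the safer choice.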