Python Unicode decode error when accessing records of OrderedDict

Question

Using Python 3.5.2 on Windows (32-bit), I am reading a DBF file, which returns each record as an OrderedDict.

from dbfread import DBF
Table = DBF('FME.DBF')
for record in Table:
   print(record)

Accessing the first records works fine, until I reach a record that contains a diacritic:

Traceback (most recent call last):
  File "getdbe.py", line 3, in <module>
    for record in Table:
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\dbf.py", line 311, in _iter_records
    for field in self.fields]
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\dbf.py", line 311, in <listcomp>
    for field in self.fields]
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\field_parser.py", line 75, in parse
    return func(field, data)
  File "...\AppData\Local\Programs\Python\Python35-32\lib\site-packages\dbfread\field_parser.py", line 83, in parseC
    return decode_text(data.rstrip(b'\0 '), self.encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position 11: ordinal not in range(128)

Even if I don't print the record, I still get the error.

Any ideas?

Answer 1

dbfread failed to detect the correct encoding from your DBF file. From the Character Encodings section of the documentation:

dbfread will try to detect the character encoding (code page) used in the file by looking at the language_driver byte. If this fails it reverts to ASCII. You can override this by passing encoding='my-encoding'.

Emphasis mine.
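To see why the ASCII fallback fails, the same error can be reproduced without dbfread. This is only a minimal sketch: the bytes in raw are a made-up field value, chosen because it contains the byte 0x82 reported in the traceback.

```python
# Hypothetical field bytes; 0x82 is the byte named in the traceback.
raw = b"Caf\x82"

try:
    # ASCII only covers bytes 0x00-0x7F, so 0x82 cannot be decoded.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The printed message has the same shape as the one in the question; only the position differs, because the sample is shorter than the real field.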

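You'll have to pass in an explicit encoding; this will invariably be a Windows code page. Take a look at the supported codecs in Python; you'll have to use one whose name starts with cp. If you don't know which code page fits your file, you'll need to do some trial and error. Note that some code pages overlap in the characters they define, so even if a code page appears to produce legible results, you may want to keep searching and try different records in your data file to see which one fits best.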
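The trial and error could be sketched roughly as follows. The helper names preview_codepages and open_table, the sample bytes, and the candidate list are all assumptions made for illustration; only the encoding keyword of DBF comes from the documentation quoted above.

```python
def preview_codepages(raw, candidates=("cp437", "cp850", "cp1252")):
    """Decode the same bytes with each candidate code page for comparison."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This code page has no character for at least one byte.
            results[name] = None
    return results


def open_table(path, encoding):
    """Open a DBF file with an explicit encoding instead of auto-detection."""
    # Imported here so the preview helper works without dbfread installed.
    from dbfread import DBF
    return DBF(path, encoding=encoding)


# Hypothetical field value containing the byte 0x82 from the traceback.
sample = b"Caf\x82"
for name, text in preview_codepages(sample).items():
    print(name, repr(text))
```

Once one candidate looks right across several records, it can be passed along, for example open_table('FME.DBF', 'cp850'). Here cp850 is only a guess; which code page is correct depends on the software that wrote the file.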
