Python encoding issue (possibly a Windows-to-Linux issue)

Question

I am developing a program written in Python on Windows. It reads a CSV file. Here is part of the code:

with open(os.path.abspath(self.currencies_file_path), 'r') as f:
    reader = csv.reader(f)
    # for each row, find whether such an isocode exists in the table
    for row in reader:   # This is line 49

This is the error:

  File "whatever/staticdata.py", line 49, in upload_currencies
    for row in reader:
  File "/usr/lib/python3.4/codecs.py", line 313, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf3 in position 1307: invalid continuation byte

The CSV file is not even encoded in UTF-8 (I think). Why am I running into this problem?

P.S. I know nothing about encodings.

Answer 1

To check a file's encoding, you can use the file command:

$ file utils.py
utils.py: Python script, UTF-8 Unicode text executable

To convert the file, you can use the iconv command:

iconv -f ascii -t utf-8 utils.py -o utils.utf8.py

Options: -f: from-encoding; -t: to-encoding; -o: output file.
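If iconv is not available (for example on the Windows side), the same conversion can be sketched in Python. This is only a minimal illustration: the function name transcode, the file names, the sample row, and the assumption that the source file is cp1252 are all made up for the example and are not taken from the question.

```python
import os
import tempfile


def transcode(src_path, dst_path, src_encoding="cp1252", dst_encoding="utf-8"):
    """Re-encode a text file, reading src_encoding and writing dst_encoding."""
    # newline="" disables newline translation, so line endings are preserved.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
            open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)


# Demonstration with a throwaway file (hypothetical names and data).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "currencies.cp1252.csv")
dst = os.path.join(workdir, "currencies.utf8.csv")

# 0xf3 is the byte from the traceback; in cp1252 it is the letter "ó".
with open(src, "wb") as f:
    f.write(b"NIO,C\xf3rdoba\r\n")

transcode(src, dst)

with open(dst, "rb") as f:
    converted = f.read()
```

After the conversion, the single byte 0xf3 becomes the two-byte UTF-8 sequence 0xc3 0xb3, and the file can be read with the UTF-8 codec.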

Last but not least, declare the encoding explicitly (right below the shebang line):

# -*- coding: utf-8 -*-

So, for a working example, you would have something like:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
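One caveat worth keeping in mind: as far as I know, the coding declaration only tells the interpreter how to decode the source file itself. It does not change how open() decodes data files. When no encoding argument is given, text-mode open() normally falls back to a locale-dependent default, which would explain why the same script behaves differently on Windows and Linux. A small sketch to inspect that default on a given machine:

```python
import locale

# The locale's preferred encoding, which text-mode open() typically uses
# when no encoding argument is passed. It is commonly "UTF-8" on Linux
# and "cp1252" on many Western European Windows installations.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```

Running this on both machines should show whether the platform default is what differs.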

For a list of the encodings supported by iconv, you can type:

iconv -l

Answer 2

If you think it is latin-1, try this:

import csv
import io
import os

with io.open(os.path.abspath(self.currencies_file_path), encoding='latin-1') as f:
    reader = csv.reader(f)
    for row in reader:
        pass  # process each row here
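For anyone who wants to try this outside the original class, here is a self-contained Python 3 sketch of the same idea. The file name and the sample row are invented for the demonstration; only the byte 0xf3 comes from the traceback in the question.

```python
import csv
import os
import tempfile

# Hypothetical sample: one row containing byte 0xf3, which is "ó" in latin-1.
path = os.path.join(tempfile.mkdtemp(), "currencies.csv")
with open(path, "wb") as f:
    f.write(b"NIO,C\xf3rdoba\r\n")

# Decoding this file as UTF-8 would fail on 0xf3. Naming the encoding
# explicitly makes the result independent of the platform default.
# newline="" is what the csv module documentation recommends for file objects.
with open(path, "r", encoding="latin-1", newline="") as f:
    rows = list(csv.reader(f))

print(rows)
```

Passing encoding explicitly means the script no longer depends on whichever default the operating system happens to provide.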

Answer 3

Windows is probably using CP-1252.

There is no way to know with 100% certainty which encoding a file uses; see this Stack Overflow question for reference. If you are using Python 3, just specify the encoding when opening the file. If you are using Python 2, you can use io.open to specify the encoding.
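Since the encoding cannot be detected reliably, one pragmatic approach is to try a short list of likely candidates in order. The sketch below is only an illustration of that idea: the helper name decode_with_fallback, the candidate list, and the sample bytes are assumptions, not something from the question.

```python
def decode_with_fallback(data, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# 0xf3 starts a multi-byte sequence in UTF-8, but the next byte ("r") is not
# a continuation byte, so UTF-8 is rejected and cp1252 is tried next.
raw = b"NIO,C\xf3rdoba"
text, used = decode_with_fallback(raw)
print(used, text)
```

A clean decode does not prove the guess is right. latin-1 in particular accepts every possible byte, so it only makes sense as the last resort, and the decoded text should still be checked by eye.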
