How do I handle this kind of error?

import pandas as pd

companies = pd.read_csv("http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv", index_col=0)

companies.head()

I get the error below. Please suggest what I should try.

"'utf-8' codec can't decode byte 0xb7 in position 7"

Downloading the file and opening it in Notepad++ shows that it is ANSI-encoded, not UTF-8. If you are on Windows, this should fix it:

import pandas as pd

url = "http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv"

companies = pd.read_csv(url, index_col = 0, encoding='ansi')

print(companies)

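If you are not on Windows, the 'ansi' alias is not available (Python only provides it on Windows), so you will need to name the encoding explicitly. On Western European Windows installations "ANSI" usually means 'cp1252', which is a reasonable first guess.

See: https://docs.python.org/3/library/codecs.html#standard-encodings

Output: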

                                       Name              Industry  \
0                                   Walmart                Retail
1                             Sinopec Group           Oil and gas
2      China National Petroleum Corporation           Oil and gas
...                                     ...                   ...
47               Hewlett Packard Enterprise           Electronics
48                               Tata Group          Conglomerate

    Revenue (USD billions)  Employees
0                      482    2200000
1                      455     358571
2                      428    1636532
...                    ...        ...
47                     111     302000
48                     108     600000
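
For the non-Windows case, one option is to try a short list of candidate encodings in order. The following is only a minimal sketch using the standard library: the function name decode_with_fallback, the order of the encodings, and the sample bytes are all assumptions for illustration, not taken from the actual file.

```python
import csv
import io


def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin1")):
    """Return (text, encoding) for the first encoding that decodes raw cleanly."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # With the default list this is not reached, because latin1 maps every byte.
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical sample: 0xb7 is not a valid UTF-8 start byte,
# but it is the middle dot character in cp1252.
sample = b"id,Name\n0,A\xb7B\n"
text, used = decode_with_fallback(sample)
rows = list(csv.reader(io.StringIO(text)))
print(used, rows)
```

With pandas, the encoding name that worked could then be passed as encoding=... to pd.read_csv, or the decoded text could be wrapped in io.StringIO and read directly.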

On macOS, try the 'latin1' encoding:

companies = pd.read_csv("http://www.richard-muir.com/data/public/csv/CompaniesRevenueEmployees.csv",
                        index_col=0,
                        encoding='latin1')
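
One caveat: 'latin1' never raises a decode error because it maps every byte to a character, so it can silently produce wrong characters if the file is really in another code page. The short check below is a sketch with made-up sample bytes. It shows that 'latin1' and 'cp1252' agree on the byte 0xb7 from the error message but disagree on 0x80.

```python
# The byte reported in the error message.
middle_dot = b"\xb7"
# A byte in the 0x80-0x9f range, where the two encodings differ.
euro = b"\x80"

# Both encodings decode 0xb7 to the same character (the middle dot).
same = middle_dot.decode("cp1252") == middle_dot.decode("latin1")

# cp1252 decodes 0x80 to the euro sign; latin1 decodes it to a control character.
differs = euro.decode("cp1252") != euro.decode("latin1")

print(same, differs)
```

If the loaded data shows odd characters where quotes, dashes, or currency symbols are expected, 'cp1252' may be the better choice.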