'utf-8' codec can't decode byte 0xd5 in position 2912: invalid continuation byte error when reading a csv file in Python
I am looping through the rows of a csv file, and partway through the loop I get this error:
'utf-8' codec can't decode byte 0xd5 in position 2912: invalid continuation byte
I am only trying to get the number of rows in the file with this function:
import csv

def count_lines(filename):
    row_stored = ""
    try:
        # No encoding is passed, so Python uses the platform default (UTF-8 here)
        with open(filename) as csvfile:
            data_reader = csv.reader(csvfile)
            next(data_reader)  # skip the header row
            count = 0
            for index, row in enumerate(data_reader):
                if index == 1220119:
                    print(row)
                row_stored = row  # remember the last row that was read successfully
                count += 1
            return count
    except Exception as e:
        print(f'There was a problem with your request: {e}\n', row_stored)
        return False
The row just above the one that fails looks like this:
['817949019495', 'QMMZN1300568', '4/28/2017', 'Digital Revenue', 'Track', 'Download Europe', 'GB', 'Amazon International - UK', '', '2', '1.2126506333579932', '109926407', '2/28/2017']
The row that throws the error looks like this:
['817949019495', 'QMMZN1300568', '4/28/2017', 'Digital Revenue', 'Track', 'Download Europe', 'GB', 'Amazon International - UK', '', '2', '1.2126506333579932', '109926407', '2/28/2017']
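I can't see any difference between the two. Is there some formatting on this particular row that I'm not seeing?
Note: this csv file is 3.17 GB. I don't know whether that is a contributing factor.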
Changing the encoding solved the problem:
with open(filename, encoding="ISO-8859-1") as csvfile:
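ISO-8859-1 maps every possible byte to a character, so decoding with it never raises an error; that makes the count work, but it does not prove the file was really written in ISO-8859-1. It may also explain why the two printed rows look identical: the decode error is probably raised while the next chunk of the file is being read, so row_stored still holds the last good row rather than the one containing the bad byte. Below is a minimal sketch, not a definitive implementation. The names count_rows and find_bad_lines and the limit parameter are hypothetical and introduced here only for illustration. The first function counts rows with an explicit encoding; the second reads the file in binary mode and reports the physical lines that fail to decode as UTF-8, so the offending bytes can be inspected.

```python
import csv

def count_rows(filename, encoding="ISO-8859-1"):
    """Count the data rows (header excluded) of a CSV file."""
    # newline="" is what the csv module documentation recommends for files
    # handed to csv.reader.
    with open(filename, encoding=encoding, newline="") as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # skip the header row, if there is one
        return sum(1 for _ in reader)

def find_bad_lines(filename, encoding="utf-8", limit=5):
    """Return up to `limit` (line_number, raw_bytes) pairs that fail to decode."""
    bad = []
    with open(filename, "rb") as f:  # binary mode: no decoding while reading
        for number, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError:
                bad.append((number, raw))
                if len(bad) >= limit:
                    break
    return bad
```

Both functions stream the file, so the 3.17 GB size should not be a problem in itself. Note that find_bad_lines reports physical line numbers, which can differ from csv row indexes if quoted fields contain line breaks. Once the raw bytes are visible, it is easier to judge whether ISO-8859-1, cp1252, or some other encoding is the right choice for the data.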