IndexError: list index out of range in a loop of readlines()
I don't understand why this gives me 'IndexError: list index out of range'.
I am reading a simple CSV file and trying to get the comma-separated values.
with open('project_twitter_data.csv','r') as twf:
    tw = twf.readlines()[1:] # I don't need the very first line
    for i in tw:
        linelst = i.strip().split(",")
        RT = linelst[1]
        RP = linelst[2]
        rows = "{}, {}".format(RT,RP)
My output looks like this:
print(tw) # the original strings.
..\nBORDER Terrier puppy. Name is loving and very protective of the people she loves. Name2 is a 3 year old Maltipoo. Name3 is an 8 year old Corgi.,4,6\nREASON they did not rain but they will reign beautifully couldn't asked for a crime 80 years in the Spring Name's Last Love absolutely love,19,0\nHOME surrounded by snow in my Garden. But City Name people musn't: such a good book: RT @twitteruser The Literature of Conflicted Lands after a,0,0\n\n"
print (i)
..
BORDER Terrier puppy. Name is loving and very protective of the people she loves. Name2 is a 3 year old Maltipoo. Name3 is an 8 year old Corgi.,4,6
REASON they did not rain but they will reign beautifully couldn't asked for a crime 80 years in the Spring Name's Last Love absolutely love,19,0
HOME surrounded by snow in my Garden. But City Name people musn't: such a good book: RT @twitteruser The Literature of Conflicted Lands after a,0,0
print(linelst)
..
['BORDER Terrier puppy. Name is loving and very protective of the people she loves. Name2 is a 3 year old Maltipoo. Name3 is an 8 year old Corgi.', '4', '6']
["REASON they did not rain but they will reign beautifully couldn't asked for a crime 80 years in the Spring Name's Last Love absolutely love", '19', '0']
["HOME surrounded by snow in my Garden. But City Name people musn't: such a good book: RT @twitteruser The Literature of Conflicted Lands after a", '0', '0']
['']
print(rows)
..
4, 6
19, 0
0, 0
# the error
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-7-f27e87689f41> in <module>
6 linelst = i.strip().split(",")
7 # print(linelst)
----> 8 RT = linelst[1]
9 RP = linelst[2]
IndexError: list index out of range
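What am I doing wrong?
I also noticed that after using strip().split(","), a [''] (a list holding a single empty string) shows up at the very end of my lists.
I can remove it with twf.readlines()[1:][:-1], but the error persists.
Thanks for your advice.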
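Your last line is empty after stripping, so split produces a list containing a single empty string. That list has no index 1, which is why linelst[1] raises the IndexError.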
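To see this in isolation, here is a minimal sketch. The sample strings are made up for illustration and are not taken from your file:

```python
# Hypothetical sample lines: one complete row and the blank line at the end of the file.
complete = "some tweet text,4,6\n"
blank = "\n"

complete_fields = complete.strip().split(",")
blank_fields = blank.strip().split(",")

print(complete_fields)  # ['some tweet text', '4', '6']
print(blank_fields)     # ['']

# A one-element list has no index 1, hence the IndexError.
try:
    blank_fields[1]
except IndexError as exc:
    print("IndexError:", exc)
```

Note that splitting an empty string on an explicit separator never returns an empty list; it always returns [''].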
The simplest solution is to skip empty lines explicitly:
with open('project_twitter_data.csv','r') as twf:
    next(twf, None)  # Advance past first line without needing to slurp whole file into memory and
                     # slice it, tying peak memory usage to max line size, not size of file
    for line in twf:
        line = line.strip()
        if not line:
            continue
        linelst = line.split(",")
        # If non-empty, but incomplete lines should be ignored:
        if len(linelst) < 3:
            continue
        RT = linelst[1]
        RP = linelst[2]
        rows = "{}, {}".format(RT,RP)
Or, simpler still, use EAFP patterns and the csv module, which you should always use when working with CSV files (the format is a lot more complicated than "split on commas"):
import csv

with open('project_twitter_data.csv', 'r', newline='') as twf:  # newline='' needed for proper CSV dialect handling
    csvf = csv.reader(twf)
    next(csvf, None)  # Advance past first row without needing to slurp whole file into memory and
                      # slice it, tying peak memory usage to max line size, not size of file
    for row in csvf:
        try:
            RT, RP = row[1:3]
        except ValueError:
            continue  # Didn't have enough elements, incomplete line
        rows = "{}, {}".format(RT,RP)
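Note: In both cases, I made some minor improvements to avoid large temporary lists, and tweaked a few small things for readability (naming a str variable i is bad form; i is typically used for indexes, or at least integers, and you had a clearer name readily available, so even a placeholder like x wasn't appropriate).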
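As a rough illustration of why the csv module is worth using here, the sketch below compares naive splitting with csv.reader on made-up data in which the text field itself contains a comma. The sample string and the variable names are assumptions for the demo, and io.StringIO stands in for the real file:

```python
import csv
import io

# Made-up data: the first field contains a comma, so it is quoted in the file.
# The trailing blank line mimics the one at the end of the real file.
sample = 'text,retweets,replies\n"Hello, world",4,6\n\n'

# Naive approach: split every line after the header on commas.
naive = [line.strip().split(",") for line in sample.splitlines()[1:]]

# csv approach: let the reader handle the quoting rules.
with io.StringIO(sample, newline="") as fake_file:
    reader = csv.reader(fake_file)
    next(reader, None)  # skip the header row
    parsed = list(reader)

print(naive)   # [['"Hello', ' world"', '4', '6'], ['']]
print(parsed)  # [['Hello, world', '4', '6'], []]
```

The naive split breaks the quoted text into two fields, which shifts the numeric columns. csv.reader keeps the text intact and returns an empty list for the blank line; since row[1:3] on an empty list yields nothing to unpack, the ValueError handler in the loop above skips that row.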