Getting index out of range error when using pop in Python
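I am trying to read a text file and remove all stop words from it. However, I get an index out of range error when using b[i].pop(j).
But if I use print(b[i][j]) instead, I get no error and the words are printed as output.
Can anyone spot the mistake?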
import nltk
from nltk.corpus import stopwords
stop = stopwords.words('english')
fo = open("text.txt", "r")
# text.txt is just a text document
list = fo.read();
list = list.replace("\n","")
# removing newline character
b = list.split('.',list.count('.'))
# splitting list into lines
for i in range (len(b) - 1) :
    b[i] = b[i].split()
# splitting each line into words
for i in range (0,len(b)) :
    for j in range (0,len(b[i])) :
        if b[i][j] in stop :
            b[i].pop(j)
            # print(b[i][j])
#print(b)
# Close opened file
fo.close()
Output:
Traceback (most recent call last):
  File "prog.py", line 29, in <module>
    if b[i][j] in stop :
IndexError: list index out of range
Output after commenting out b[i].pop(j) and uncommenting print(b[i][j]):
is
that
the
from
the
the
the
can
the
and
and
the
is
and
can
be
into
is
a
or
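You are removing elements from the list while iterating over it. The list shrinks during the iteration, but range(0, len(b[i])) was computed from the original length, so j eventually points past the end of the shortened list and b[i][j] raises an IndexError. With print(b[i][j]) nothing is removed, so every index stays valid.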
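To see the effect in isolation, here is a minimal sketch with made-up data: the short `stop` list and the `words` list are stand-ins for illustration, not the NLTK stop word list or the contents of text.txt. It shows the IndexError and also that one stop word is silently skipped.

```python
# Hypothetical sample data; stop is a stand-in for stopwords.words('english').
stop = ["is", "the", "a"]
words = ["this", "is", "the", "end"]

error = None
try:
    # range(len(words)) is evaluated once, so j still runs up to 3
    # even after pop() has made the list shorter.
    for j in range(len(words)):
        if words[j] in stop:
            words.pop(j)
except IndexError as exc:
    error = exc

print(words)        # "the" is still there: it moved to index 1 and was never checked
print(repr(error))  # the lookup words[3] failed on the shortened list
```

After "is" is popped at index 1, "the" slides into index 1 while j moves on to 2, so it is never tested. The loop then reaches j = 3 on a list of length 3 and fails on the lookup, the same line the traceback points at.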
Instead, build a new list that contains only the elements you want to keep. Example:
result = []
for i in range (0,len(b)):
    templist = []
    for j in range (0,len(b[i])):
        if b[i][j] not in stop :
            templist.append(b[i][j])
    result.append(templist)
The same can be done with a list comprehension:
result = [[word for word in sentence if word not in stop] for sentence in b]
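For a version that runs without NLTK or a text file, the sketch below wraps the comprehension in a helper. The function name `remove_stop_words`, the short `stop` list and the sample `text` are assumptions made for illustration. Converting the stop words to a set is an optional tweak for faster membership tests, not something the approach requires.

```python
def remove_stop_words(sentences, stop_words):
    """Return new word lists with the stop words left out."""
    stop_set = set(stop_words)  # optional: set lookups are faster than list lookups
    return [[word for word in sentence if word not in stop_set]
            for sentence in sentences]

# Hypothetical stand-ins for stopwords.words('english') and the file contents.
stop = ["is", "the", "a", "and"]
text = "This is the first line. The cat and a dog."

# Split into sentences on '.', drop empty pieces, then split each into words.
b = [sentence.split() for sentence in text.split(".") if sentence.strip()]
result = remove_stop_words(b, stop)
print(result)
```

The comparison is case-sensitive, so "The" is kept here because only the lowercase "the" is in the sample stop list. Lowercase each word before the test if that is not what you want.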