list index out of range for nested for loop
This is my df.
I created the function below to get trigrams based on the part-of-speech tags of the reviews.
def get_trigram(pos_1, pos_2, pos_3):
    all_trigram = []
    for j in range(len(df)):
        trigram = []
        for i in range(len(df['pos'][j]['pos'])):
            if [value for value in df['pos'][j]['pos']][i-2] == pos_1 and [value for value in df['pos'][j]['pos']][i-1] == pos_2 and [value for value in df['pos'][j]['pos']][i] == pos_3:
                trigram.append([value for value in df['pos'][j]['word']][i-2] + " " + [value for value in df['pos'][j]['word']][i-1] + " " + [value for value in df['pos'][j]['word']][i])
        all_trigram.append(trigram)
    return all_trigram
Defining the function raises no error, but when I call it
tri_adv_adj_noun = get_trigram('ADV', 'ADJ', 'NOUN')
I get: IndexError: list index out of range
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-149-12b4d4ffff3d> in <module>()
----> 1 tri_adv_adj_noun = get_trigram('ADV', 'ADJ', 'NOUN')
2 tri_noun_adv_adj = get_trigram('NOUN', 'ADV', 'ADJ')
3
4 trigram = tri_adv_adj_noun + tri_noun_adv_adj
<ipython-input-148-60ed39e749d0> in get_trigram(pos_1, pos_2, pos_3)
8 for i in range(len(df_long['pos'][j]['pos'])):
9
---> 10 if [value for value in df_long['pos'][j]['pos']][i-2] == pos_1 and [value for value in df_long['pos'][j]['pos']][i-1] == pos_2 and [value for value in df_long['pos'][j]['pos']][i] == pos_3:
11 trigram.append([value for value in df_long['pos'][j]['word']][i-2] + " " + [value for value in df_long['pos'][j]['word']][i-1] + " " + [value for value in df_long['pos'][j]['word']][i])
12
IndexError: list index out of range
FYI, df['pos'][0] returns a dictionary of 2 lists.
I assume your problem is in this part:
[value for value in df_long['pos'][j]['pos']][i-2]
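First, some of the 'pos' dictionaries in the 'pos' column may be missing data, in which case you should add a condition that first verifies the dictionary is populated. Otherwise, when you index into a list that has fewer elements than the index requires, you get that error. For example, when i is 0, i-2 is -2, which counts back 2 positions from the end of the list; when the list does not have enough elements to go back that far, Python throws the "list index out of range" error.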
For example:
if len(df['pos'][j]['pos']) >= 3:
    for i in range(len(df['pos'][j]['pos'])):
        ...
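To make the failure mode concrete, here is a small sketch of how Python's negative indexing behaves on short lists. The tag lists are made up for illustration and are not taken from the asker's data.

```python
# Hypothetical POS-tag lists of different lengths (illustrative data only).
one_tag = ['NOUN']
two_tags = ['ADJ', 'NOUN']

i = 0  # first iteration of the inner loop

# With two elements, index i-2 == -2 is valid: it wraps around
# and refers to the first element of the list.
wrapped = two_tags[i - 2]

# With one element, index -2 does not exist, so Python raises IndexError.
try:
    one_tag[i - 2]
    raised = False
except IndexError:
    raised = True

print(wrapped, raised)
```

Note that the length guard prevents the exception, but for i = 0 and i = 1 the negative indices still wrap around to the end of the list instead of failing, so those two iterations compare tags that are not actually adjacent.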
Second, writing the code like this is redundant, because you are building a new list from data that is already a list. You can write:
if df_long['pos'][j]['pos'][i-2] == pos_1 and df_long['pos'][j]['pos'][i-1] == pos_2 etc..
Or improve its readability further by adding descriptively named variables:
for j in range(len(df)):
    trigram = []
    pos_list = df['pos'][j]['pos']
    if len(pos_list) >= 3:
        for i in range(len(pos_list)):
            if pos_list[i-2] == pos_1 and pos_list[i-1] == pos_2 ...
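Putting the pieces together, one possible complete version is sketched below. It differs from the snippets above in two ways, both of which are my own choices rather than anything from the question: it takes the column as a parameter (you would pass df['pos']), and it starts the inner loop at index 2, which makes the length check implicit and keeps i-2 from ever going negative. The sample data is hypothetical, and the sketch assumes each cell holds a dict with parallel 'pos' and 'word' lists, as described in the question.

```python
def get_trigram(pos_column, pos_1, pos_2, pos_3):
    """For each row, collect word trigrams whose POS tags equal (pos_1, pos_2, pos_3)."""
    target = (pos_1, pos_2, pos_3)
    all_trigram = []
    for cell in pos_column:
        pos_list = list(cell['pos'])
        word_list = list(cell['word'])
        trigram = []
        # Start at 2 so that i-2 is never negative; rows with fewer than
        # 3 tags give an empty range and are skipped automatically.
        for i in range(2, len(pos_list)):
            if (pos_list[i - 2], pos_list[i - 1], pos_list[i]) == target:
                trigram.append(" ".join(word_list[i - 2:i + 1]))
        all_trigram.append(trigram)
    return all_trigram


# Hypothetical sample rows standing in for df['pos'] (not the asker's data).
sample_column = [
    {'word': ['very', 'good', 'food'], 'pos': ['ADV', 'ADJ', 'NOUN']},
    {'word': ['nice'], 'pos': ['ADJ']},  # too short to contain a trigram
]

result = get_trigram(sample_column, 'ADV', 'ADJ', 'NOUN')
print(result)
```

Rows that are too short simply contribute an empty list, so the output keeps one entry per row of the input.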
Hope this helps!