Find the sentence’s index (sentences in a list) of a specific word in Python
I currently have a file that contains a list that looks like this:
example = ['Mary had a little lamb' ,
'Jack went up the hill' ,
'Jill followed suit' ,
'i woke up suddenly' ,
'it was a really bad dream...']
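I would like to find the index of the sentence that contains the word "woke".
In this example, the answer should be f("woke") = 3, where f is a function.
I tried tokenizing each sentence to first find the index of the word: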
>>> from nltk.tokenize import word_tokenize
>>> example = ['Mary had a little lamb' ,
... 'Jack went up the hill' ,
... 'Jill followed suit' ,
... 'i woke up suddenly' ,
... 'it was a really bad dream...']
>>> tokenized_sents = [word_tokenize(i) for i in example]
>>> for i in tokenized_sents:
...     print(i)
...
['Mary', 'had', 'a', 'little', 'lamb']
['Jack', 'went', 'up', 'the', 'hill']
['Jill', 'followed', 'suit']
['i', 'woke', 'up', 'suddenly']
['it', 'was', 'a', 'really', 'bad', 'dream', '...']
But I am not sure how to finally get the index of the word and how to link it to the index of the sentence. Does anyone know how to do this?
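def first_index(tokenized_sents):
    # Return the index of the first tokenized sentence that contains 'woke'.
    for index, sentence in enumerate(tokenized_sents):
        if 'woke' in sentence:
            return index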
For all matching sentences, return a list of indices from the function instead:
    return [index for index, sentence in enumerate(tokenized_sents) if 'woke' in sentence]
If the requirement is to return the first sentence in which the word occurs, you can use:
def func(strs, word):
    for idx, s in enumerate(strs):
        if s.find(word) != -1:
            return idx
example = ['Mary had a little lamb' ,
'Jack went up the hill' ,
'Jill followed suit' ,
'i woke up suddenly' ,
'it was a really bad dream...']
func(example,"woke")
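One caveat: s.find(word) matches substrings, so searching for "it" would hit "little" in the first sentence. Below is a minimal sketch of a whole-word variant, assuming that whole-word matching is what is wanted and that the search term is a plain word made of letters. The name first_sentence_index is hypothetical, not from the answer above.

```python
import re

def first_sentence_index(sentences, word):
    # \b anchors the match at word boundaries, so 'it' does not match inside 'little'.
    # re.escape guards against regex metacharacters in the search word.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b')
    for idx, sentence in enumerate(sentences):
        if pattern.search(sentence):
            return idx
    return None  # the word does not occur in any sentence

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

print(first_sentence_index(example, 'woke'))  # 3
print(first_sentence_index(example, 'it'))    # 4, whereas str.find would give 0
```

Returning None explicitly when nothing matches makes the "not found" case easier to tell apart from index 0.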
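You can iterate over each string in the list, split it on whitespace, and then check whether your search word is in the resulting list of words. If you do this in a list comprehension, you can return a list of the indices of the strings that satisfy this condition.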
def f(l, s):
    return [index for index, value in enumerate(l) if s in value.split()]
>>> f(example, 'woke')
[3]
>>> f(example, 'foobar')
[]
>>> f(example, 'a')
[0, 4]
If you prefer to use the nltk library:
from nltk.tokenize import word_tokenize

def f(l, s):
    return [index for index, value in enumerate(l) if s in word_tokenize(value)]
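If many different words need to be looked up, it may be worth building the word-to-sentence mapping once instead of rescanning the list on every call. The following is a rough sketch using only the standard library. The helper name build_index and the normalization choices (lowercasing, stripping surrounding punctuation) are assumptions made for illustration, not part of the answers above.

```python
import string
from collections import defaultdict

def build_index(sentences):
    # Map each normalized word to the sorted list of sentence indices containing it.
    index = defaultdict(list)
    for i, sentence in enumerate(sentences):
        # Split on whitespace, strip surrounding punctuation, lowercase;
        # a set avoids recording the same sentence twice for a repeated word.
        tokens = {tok.strip(string.punctuation).lower() for tok in sentence.split()}
        for tok in tokens:
            if tok:  # skip tokens that were pure punctuation
                index[tok].append(i)
    return index

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

index = build_index(example)
print(index.get('woke', []))    # [3]
print(index.get('dream', []))   # [4], because the trailing '...' is stripped
print(index.get('foobar', []))  # []
```

Using index.get(word, []) rather than index[word] avoids inserting empty entries into the defaultdict for words that are not present.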