Find the index of the sentence (in a list of sentences) that contains a specific word in Python

I currently have a file that contains a list that looks like this:

example = ['Mary had a little lamb' , 
       'Jack went up the hill' , 
       'Jill followed suit' ,    
       'i woke up suddenly' ,
       'it was a really bad dream...']

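I want to find the index of the sentence that contains a given word, for example "woke". In this example the answer should be f("woke") = 3, where f is a function.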

I tried tokenizing each sentence so that I could first find the index of the word:

>>> from nltk.tokenize import word_tokenize
>>> example = ['Mary had a little lamb' , 
...            'Jack went up the hill' , 
...            'Jill followed suit' ,    
...            'i woke up suddenly' ,
...            'it was a really bad dream...']
>>> tokenized_sents = [word_tokenize(i) for i in example]
>>> for i in tokenized_sents:
...     print(i)
... 
['Mary', 'had', 'a', 'little', 'lamb']
['Jack', 'went', 'up', 'the', 'hill']
['Jill', 'followed', 'suit']
['i', 'woke', 'up', 'suddenly']
['it', 'was', 'a', 'really', 'bad', 'dream', '...']

But I don't know how to end up with the index of the word, or how to link it to the index of the sentence. Does anyone know how to do this?

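def find_first(tokenized_sents, word):
    # Return the index of the first tokenized sentence containing word.
    for index, sentence in enumerate(tokenized_sents):
        if word in sentence:
            return index
    return None  # no sentence contains the word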

For all matching sentences:

[index for index, sentence in enumerate(tokenized_sents) if 'woke' in sentence]
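
To link the word's position to the sentence's position, as the question asks, one option is to collect (sentence index, word index) pairs. This is only a sketch: `find_word_positions` is a hypothetical helper name, and it uses `str.split()` instead of `word_tokenize` so that it runs without nltk. A list of pre-tokenized sentences could be handled the same way.

```python
def find_word_positions(sentences, word):
    """Return (sentence_index, word_index) pairs for every occurrence of word."""
    positions = []
    for sent_idx, sentence in enumerate(sentences):
        # Assumption: whitespace splitting is a good enough tokenizer here.
        for word_idx, token in enumerate(sentence.split()):
            if token == word:
                positions.append((sent_idx, word_idx))
    return positions


example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

print(find_word_positions(example, 'woke'))  # [(3, 1)]
print(find_word_positions(example, 'up'))    # [(1, 2), (3, 2)]
```

The first element of each pair is the sentence index and the second is the word's position inside that sentence.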

If the requirement is to return the first sentence in which the word occurs, you can use:

def func(strs, word):
    # Return the index of the first string that contains word (substring match).
    for idx, s in enumerate(strs):
        if word in s:
            return idx
    return None  # word not found in any string

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']
print(func(example, "woke"))  # 3
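
Note that the function above matches substrings, so a short word can match inside a longer one. If whole-word matches are wanted, one possible variation uses a regular expression with `\b` word boundaries. `first_whole_word_index` is a hypothetical name used only for this sketch, and it assumes the search word starts and ends with a letter, digit, or underscore.

```python
import re


def first_whole_word_index(strs, word):
    """Return the index of the first string containing word as a whole word, else None."""
    # re.escape guards against regex metacharacters in the search word.
    pattern = re.compile(r'\b' + re.escape(word) + r'\b')
    for idx, s in enumerate(strs):
        if pattern.search(s):
            return idx
    return None


example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

# A plain substring search for 'it' would return 0, because of 'little'.
print(first_whole_word_index(example, 'it'))      # 4
print(first_whole_word_index(example, 'dream'))   # 4
print(first_whole_word_index(example, 'foobar'))  # None
```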

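You can iterate over each string in the list, split it on whitespace, and then check whether your search word is in that list of words. If you do this in a list comprehension, you can return a list of the indices of all strings that satisfy this requirement.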

def f(l, s):
    return [index for index, value in enumerate(l) if s in value.split()]

>>> f(example, 'woke')
[3]
>>> f(example, 'foobar')
[]
>>> f(example, 'a')
[0, 4]

If you prefer to use nltk:

from nltk.tokenize import word_tokenize

def f(l, s):
    return [index for index, value in enumerate(l) if s in word_tokenize(value)]
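
If many different words need to be looked up in the same list, one possible design is to build a word-to-indices mapping once and query it afterwards. The sketch below assumes that lowercasing and stripping surrounding punctuation is an acceptable normalization; `build_index` is a hypothetical helper, not part of nltk or the standard library.

```python
import string
from collections import defaultdict


def build_index(sentences):
    """Map each normalized word to the list of sentence indices containing it."""
    index = defaultdict(list)
    for sent_idx, sentence in enumerate(sentences):
        # Use a set so a word repeated within one sentence is recorded once.
        words = {w.strip(string.punctuation).lower() for w in sentence.split()}
        for w in words:
            if w:  # skip tokens that were pure punctuation
                index[w].append(sent_idx)
    return index


example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

index = build_index(example)
print(index.get('woke', []))    # [3]
print(index.get('a', []))       # [0, 4]
print(index.get('dream', []))   # [4]
print(index.get('foobar', []))  # []
```

Because the keys are lowercased, queries should be lowercased too (for example `index.get('mary', [])`). Using `get` with a default avoids adding empty entries to the `defaultdict` for words that never occur.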