Function that finds the most mentions of a given word in a list of strings?

Question

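I want to create a function that takes a list of strings as input and, for a given word, returns a tuple containing the string with the most mentions of that word (identified by its 1-based position in the list) and the number of mentions in that string. If several strings share the same maximum number of mentions, the one that appears first in the list is returned. The word is not case sensitive.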

For example, consider the list:

Tomatoes = ['tonight tomatoes grow towards the torchlit tower',
 'the birds fly to the sky',
 'to take the fish to the sea and to tell the tale',
 'to fly to the skies and to taste the clouds']

Note that lines 3 and 4 have the most mentions of the word 'to'. When we pass Tomatoes to the function with the search word 'to', the call should look like this:

most_word_mentions(Tomatoes, 'to')

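It should return the position of the third line and the number of mentions of 'to' as a tuple, which should look like (3, 3). Although line 3 has the same number of mentions as line 4, it is the one returned because it appears first in the list.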

I created a function that partially does what I want, but it fails under certain conditions.

def most_word_mentions(message, word):
    wordcount = []
    for i in range(len(message)):
        message[i] = message[i].lower() #word is not case sensitive
        wordcount.append(((message[i]).count(word)))
    return (wordcount.index(max(wordcount))+1), max(wordcount)

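If we call most_word_mentions(Tomatoes, 'to'), the function does not output the correct line and number of mentions. Instead, it returns (1, 6). This is because line 1, although it does not contain the standalone word 'to', contains many other words that have 'to' in them. I want to write a function that fixes this problem and can be applied to similar scenarios. Is it possible to do this using only for loops and if statements, without list comprehensions or imports?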
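To make the failure concrete, here is a small check on the first string only. It is just a sketch: the variable names total and exact_matches are made up for illustration and are not part of the function above. It shows which words contribute the substring 'to' and that none of them is the word 'to' itself.

```python
line = 'tonight tomatoes grow towards the torchlit tower'

# Substring count over the whole line, which is what str.count does.
total = line.count('to')

# Per-word breakdown: which words contain the substring 'to'?
exact_matches = 0
for token in line.split():
    hits = token.count('to')
    if hits > 0:
        print(token, hits)
    # A whole-word match requires the token to equal the search word.
    if token == 'to':
        exact_matches += 1

print(total, exact_matches)  # 6 0
```

The six substring hits come from tonight, tomatoes (twice), towards, torchlit and tower, while the number of exact word matches is zero.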

Answer 1

Here is a start I made, with a slightly different idea:

def mostCommon(word, sentences):
  sentencecount = {}  # maps each sentence to the number of occurrences of word in it
  for item in sentences:  # iterate through the sentences
    # make the sentence lowercase and split it into a list of words
    lowcasesentence = item.lower().split()
    # list.count only counts whole-word matches; store the result under the sentence
    sentencecount[item] = lowcasesentence.count(word.lower())
  return sentencecount  # dictionary with sentences as keys and counts as values

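You should not search through every combination of letters; instead, split each sentence into words and count those. Note that my function returns all the sentences with their counts, not just the sentence with the most mentions.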
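One possible way to finish this idea, using only a for loop and an if statement, is sketched here. The names count_per_sentence and first_with_most are hypothetical, and the sketch assumes Python 3.7 or later, where dictionaries keep insertion order, so scanning the dictionary visits the sentences in their original order.

```python
def count_per_sentence(word, sentences):
    # Same idea as mostCommon: map each sentence to its whole-word count.
    counts = {}
    for sentence in sentences:
        counts[sentence] = sentence.lower().split().count(word.lower())
    return counts


def first_with_most(counts):
    # Hypothetical helper: scan in insertion order and keep the first maximum.
    best_sentence = None
    best_count = 0
    for sentence in counts:
        # Strict '>' means a later tie never replaces an earlier sentence.
        if counts[sentence] > best_count:
            best_sentence = sentence
            best_count = counts[sentence]
    return (best_sentence, best_count)


Tomatoes = ['tonight tomatoes grow towards the torchlit tower',
            'the birds fly to the sky',
            'to take the fish to the sea and to tell the tale',
            'to fly to the skies and to taste the clouds']

print(first_with_most(count_per_sentence('to', Tomatoes)))
# ('to take the fish to the sea and to tell the tale', 3)
```

This returns the sentence itself rather than its position, and identical sentences collapse into a single dictionary key, so it is not a drop-in replacement for the (3, 3) result the question asks for.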

Answer 2

This solution uses only for loops and if statements, as you wanted.

def most_word_mentions(list_of_strings, word):
    word = word.lower()
    highest_count = 0
    earliest_index = 0
    for i in range(len(list_of_strings)):
        # Count whole-word matches only, so 'to' does not match 'tomatoes'
        curr_count = list_of_strings[i].lower().split().count(word)
        # Strict '>' keeps the earliest string when several share the maximum
        if curr_count > highest_count:
            earliest_index = i
            highest_count = curr_count
    return (earliest_index+1, highest_count)

print(most_word_mentions(Tomatoes, 'to'))

Output:

(3, 3)

Answer 3

You can try this.

def most_word_mentions(message, word):
    word = word.lower()
    line = 0
    most = 0
    for i in range(len(message)):
        count = 0
        for w in message[i].lower().split():
            if word == w:
                count += 1
        if count > most:
            line = i + 1
            most = count
    return (line, most)
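One caveat with splitting on whitespace is that a token such as 'to,' does not equal 'to'. If the strings may contain punctuation, a variant along these lines might help. This is only a sketch: the name most_word_mentions_punct and the set of characters passed to strip are assumptions, not part of the original question.

```python
def most_word_mentions_punct(message, word):
    # Hypothetical variant of most_word_mentions that ignores punctuation
    # attached to the start or end of a token.
    word = word.lower()
    line = 0
    most = 0
    for i in range(len(message)):
        count = 0
        for w in message[i].lower().split():
            # Assumption: these are the punctuation characters to ignore.
            if w.strip('.,!?;:"\'()') == word:
                count += 1
        if count > most:
            line = i + 1
            most = count
    return (line, most)


print(most_word_mentions_punct(['Go to, or not to!', 'to be'], 'TO'))  # (1, 2)
```

Like the function above, it still uses only for loops and if statements, and it returns (0, 0) when the word is not found in any string.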

python

for-loop

if-statement

function