Function to count repeated words in a text file? (Python)

Question

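Here is the problem:

Problem 12

Write a function named repeatWords(). The function repeatWords() takes two string parameters: the name of an input file and the name of an output file. The input file contains only lowercase letters and white space. The function repeatWords() should identify the words that occur more than once in the file and write each such word to a line of the output file, followed by the number of times the word occurs. A repeated word should be written to only one line of the output file, no matter how many times it occurs in the input file. The order in which the words are written to the output file does not matter. For example, if the input file contains the following lines: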

i would not like them here or there

i would not like them anywhere

then an output file containing the following lines would be correct:

like 2

not 2

i 2

would 2

them 2

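Here is my code. I just can't figure out how to write the if statement that checks whether the number in the counter is greater than 1 (that is, greater than or equal to 2).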

def repeatWords(inFile, outFile):
    from collections import Counter
    outFile = open(outFile, 'w')
    with open(inFile, 'r') as f:
        wordcount = Counter(f.read().split())
        for item in wordcount.items():
            if item >= 2:
                outFile.write("{} {}".format(*item))
print(repeatWords('input.txt','output.txt'))

Also, I haven't yet started on the part of the code where I need to count only the repeated words.

Answer 1

item is a tuple (word, count), so you need to compare the count itself rather than writing if item >= 2:

if item[1] >= 2:
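
To illustrate why the original test fails, here is a small sketch assuming Python 3 (the question is tagged python-3.4). The helper names repeated_by_index and tuple_vs_int_fails and the sample text are made up for this illustration and do not come from the question. Comparing a tuple with an int raises TypeError in Python 3, while indexing the tuple compares the count:

```python
from collections import Counter

def tuple_vs_int_fails():
    """Return True if comparing a (word, count) tuple with an int raises TypeError."""
    try:
        ("like", 2) >= 2  # a tuple and an int have no defined ordering in Python 3
    except TypeError:
        return True
    return False

def repeated_by_index(text):
    """Return the (word, count) pairs whose count is at least 2."""
    wordcount = Counter(text.split())
    # item[1] is the count; item[0] is the word
    return [item for item in wordcount.items() if item[1] >= 2]

print(tuple_vs_int_fails())
print(repeated_by_index("a b a"))
```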

Alternatively, you can unpack the tuples returned by items(), which makes the code more readable:

for word, count in wordcount.items():
    if count >= 2:
        print("{} {}".format(word, count))
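
Putting this together, the whole function might look something like the sketch below. It keeps the name and parameters from the question. The trailing "\n" in the format string and the with block for the output file are additions made here, on the assumption that each word should end up on its own line and that the output file should be closed when the function returns:

```python
from collections import Counter

def repeatWords(inFile, outFile):
    """Write every word that occurs at least twice in inFile to outFile, one per line."""
    # Count all whitespace-separated words in the input file.
    with open(inFile, 'r') as f:
        wordcount = Counter(f.read().split())
    # The with block closes the output file, so the lines are flushed to disk.
    with open(outFile, 'w') as out:
        for word, count in wordcount.items():
            if count >= 2:
                # "\n" is an assumption: it puts each "word count" pair on its own line.
                out.write("{} {}\n".format(word, count))
```

Calling repeatWords('input.txt', 'output.txt') is enough to produce the file. The function returns nothing, so wrapping the call in print(), as the question does, only prints None.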

python

text

counter

python-3.4