Python: count occurrences of the elements of list1 in list2

In the code below, I want to count how many times each word in word_list occurs in test. The code does this, but it may not be efficient. Is there a better way to do it?

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = [0] * len(word_list)
for i in range(len(word_list)):
    for w in test:
        if w == word_list[i]:
            result[i] += 1

print(result)

You can do this in linear time with a dictionary.

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = []
word_map = {}
for w in test:
    if w in word_map:
        word_map[w] += 1
    else:
        word_map[w] = 1

for w in word_list:
    result.append(word_map.get(w, 0))

print(result)
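
The counting loop above can be shortened a little with dict.get, which returns a default value when the key is missing. This is only a minimal sketch of the same idea; count_with_get is a hypothetical helper name and is not part of the original answer.

```python
def count_with_get(word_list, test):
    """Return, for each word in word_list, how many times it occurs in test."""
    word_map = {}
    for w in test:
        # dict.get returns 0 when w has not been seen yet
        word_map[w] = word_map.get(w, 0) + 1
    # Words that never occur in test get a count of 0
    return [word_map.get(w, 0) for w in word_list]


word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
print(count_with_get(word_list, test))  # [1, 0, 3, 0, 0]
```

The result has the same shape as in the question: one count per entry of word_list, in the same order.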

Use collections.Counter to count all the words in test in a single pass, then look up the count of each word in word_list in that Counter.

>>> word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
>>> test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
>>> import collections
>>> counts = collections.Counter(test)
>>> [counts[w] for w in word_list]
[1, 0, 3, 0, 0]

Or with a dictionary comprehension:

>>> {w: counts[w] for w in word_list}
{'perfect': 0, 'flawless': 0, 'good': 3, 'wonderful': 0, 'hello': 1}

Building the Counter should be O(n) and each lookup O(1), so for n words in test and m words in word_list you get O(n+m) overall.
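
To make the complexity argument concrete, here is a rough sketch that puts the two approaches side by side. The function names count_nested and count_with_counter are made up for illustration. The first does the same work as the nested loops in the question (via list.count), the second is the Counter approach.

```python
from collections import Counter


def count_nested(word_list, test):
    # O(n*m): list.count scans all of test once for every word
    return [test.count(w) for w in word_list]


def count_with_counter(word_list, test):
    # O(n) to build the Counter, then O(1) per lookup: O(n+m) overall.
    # A Counter returns 0 for a missing key instead of raising KeyError.
    counts = Counter(test)
    return [counts[w] for w in word_list]


word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

# Both approaches agree on the sample data
print(count_nested(word_list, test) == count_with_counter(word_list, test))  # True
```

For lists this small the difference will not be noticeable; it should only start to matter when both lists are large.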

You can combine collections.Counter and operator.itemgetter:

from collections import Counter
from operator import itemgetter

cnts = Counter(test)
word_cnts = dict(zip(word_list, itemgetter(*word_list)(cnts)))

This gives:

>>> word_cnts
{'flawless': 0, 'good': 3, 'hello': 1, 'perfect': 0, 'wonderful': 0}

Or, if you would rather have it as a list:

>>> list(zip(word_list, itemgetter(*word_list)(cnts)))
[('hello', 1), ('wonderful', 0), ('good', 3), ('flawless', 0), ('perfect', 0)]
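
One caveat with the itemgetter approach: when it is given a single key, itemgetter returns the bare value rather than a 1-tuple, and called with no keys at all it raises TypeError. So the zip above only works when word_list has at least two entries. If word_list might be shorter, a plain comprehension is a safer sketch; word_counts is a hypothetical helper name, not part of the answer.

```python
from collections import Counter
from operator import itemgetter

test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
cnts = Counter(test)


def word_counts(words, cnts):
    # Works for any number of words, including zero or one
    return {w: cnts[w] for w in words}


print(word_counts(["good"], cnts))  # {'good': 3}
print(word_counts([], cnts))        # {}

# With a single key, itemgetter returns the value itself, not a tuple
print(itemgetter("good")(cnts))     # 3
```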

You could try using a dictionary:

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = {}
for word in word_list:
    result[word] = 0
for w in test:
    # dict.has_key() was removed in Python 3; use the in operator
    if w in result:
        result[w] += 1
print(result)

But you end up with a different structure. If you don't want that, you can try this:

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = {}
for w in test:
    # dict.has_key() was removed in Python 3; use the in operator
    if w in result:
        result[w] += 1
    else:
        result[w] = 1
count = [0] * len(word_list)
for i in range(len(word_list)):
    if word_list[i] in result:
        count[i] = result[word_list[i]]
print(count)