Python: count occurrences of list1's elements in list2
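In the code below, I want to count how many times each word in word_list occurs in test. The code does the job, but it is probably not efficient. Is there a better way to do it?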
word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
result = [0] * len(word_list)
for i in range(len(word_list)):
    for w in test:
        if w == word_list[i]:
            result[i] += 1
print(result)
You can do it in linear time with a dictionary.
word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
result = []
word_map = {}
for w in test:
    if w in word_map:
        word_map[w] += 1
    else:
        word_map[w] = 1
for w in word_list:
    result.append(word_map.get(w, 0))
print(result)
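The same idea can be written a little more compactly with dict.get, which removes the explicit if/else. This is only a sketch; the function name count_in is made up for illustration and does not come from the answer above.

```python
def count_in(word_list, test):
    """Return, for each word in word_list, how often it appears in test."""
    word_map = {}
    for w in test:
        # get() supplies 0 the first time a word is seen
        word_map[w] = word_map.get(w, 0) + 1
    # words that never occur in test fall back to 0
    return [word_map.get(w, 0) for w in word_list]


word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
print(count_in(word_list, test))  # [1, 0, 3, 0, 0]
```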
Use collections.Counter to count all the words in test in a single pass, then look up the count of each word in word_list from the Counter.
>>> import collections
>>> word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
>>> test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
>>> counts = collections.Counter(test)
>>> [counts[w] for w in word_list]
[1, 0, 3, 0, 0]
Or with a dict comprehension:
>>> {w: counts[w] for w in word_list}
{'hello': 1, 'wonderful': 0, 'good': 3, 'flawless': 0, 'perfect': 0}
Building the Counter is O(n) and each lookup is O(1) on average, so for n words in test and m words in word_list the total is O(n + m).
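To make the comparison concrete, here is a small sketch that puts the quadratic approach and the Counter approach side by side and checks that they agree on the sample data. The function names count_nested and count_with_counter are hypothetical, chosen only for this example.

```python
from collections import Counter


def count_nested(word_list, test):
    # O(n * m): list.count rescans all of test for every word
    return [test.count(w) for w in word_list]


def count_with_counter(word_list, test):
    counts = Counter(test)  # one pass over test: O(n)
    # m dictionary lookups; a Counter returns 0 for missing keys
    return [counts[w] for w in word_list]


word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
print(count_nested(word_list, test) == count_with_counter(word_list, test))
```

For lists this small the difference is not measurable; it only starts to matter when both lists grow.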
You can combine collections.Counter and operator.itemgetter:
from collections import Counter
from operator import itemgetter
cnts = Counter(test)
word_cnts = dict(zip(word_list, itemgetter(*word_list)(cnts)))
This gives:
>>> word_cnts
{'hello': 1, 'wonderful': 0, 'good': 3, 'flawless': 0, 'perfect': 0}
Or, if you would rather have it as a list:
>>> list(zip(word_list, itemgetter(*word_list)(cnts)))
[('hello', 1), ('wonderful', 0), ('good', 3), ('flawless', 0), ('perfect', 0)]
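One thing to watch with itemgetter: when it is given a single key it returns the bare value rather than a 1-tuple, and when it is given no keys it raises TypeError. The sketch below guards against both cases; the wrapper name word_counts is an assumption made for this example and is not part of the answer above.

```python
from collections import Counter
from operator import itemgetter


def word_counts(word_list, test):
    """Map each word in word_list to its number of occurrences in test."""
    cnts = Counter(test)
    if not word_list:
        # itemgetter() with no arguments raises TypeError
        return {}
    values = itemgetter(*word_list)(cnts)
    if len(word_list) == 1:
        # a single key yields a bare value, so wrap it for zip()
        values = (values,)
    return dict(zip(word_list, values))


test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
print(word_counts(["good"], test))  # {'good': 3}
```

If those edge cases cannot occur in your data, the one-liner is fine as it stands.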
You could try using a dictionary:
word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
result = {}
for word in word_list:
    result[word] = 0
for w in test:
    # dict.has_key() was removed in Python 3; use the `in` operator instead
    if w in result:
        result[w] += 1
print(result)
But you will end up with a different structure (a dict instead of a list). If you do not want that, you can try this:
word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
result = {}
for w in test:
    # dict.has_key() was removed in Python 3; use the `in` operator instead
    if w in result:
        result[w] += 1
    else:
        result[w] = 1
count = [0] * len(word_list)
for i in range(len(word_list)):
    if word_list[i] in result:
        count[i] = result[word_list[i]]
print(count)