How to count element frequency for a list

I have a list of lists containing 3-grams. A sample input list looks like this:

A = [['the','big','bang'],['big','bang','theory'],...,['the','big','bang']]

How can I count the frequency (number of occurrences) of these lists? Python complains that lists are unhashable.

For the example above, I would like to get

dict[['the','big','bang']] = 2
dict[['big','bang','theory']] = 1

Thanks,

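Lists are not hashable, which is why they cannot be used as dictionary keys. Convert the inner lists to tuples so that you can count them with a dictionary; or, better yet, use Counter: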

from collections import Counter
A = [['the','big','bang'],['big','bang','theory'],['the','big','bang']]
cnt = Counter(map(tuple, A))  # tuples are hashable, so they can be counted
for k, v in cnt.items():
    print(list(k), v)

Output (Python 3.7+, where Counter keeps first-seen order):

['the', 'big', 'bang'] 2
['big', 'bang', 'theory'] 1
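
The plain-dictionary route mentioned above can be sketched roughly as follows. This is only an illustration: the helper name count_ngrams is made up here, and the first part just demonstrates the TypeError that a list key triggers.

```python
def count_ngrams(ngrams):
    """Count occurrences of each inner list, keyed by its tuple form."""
    counts = {}
    for gram in ngrams:
        key = tuple(gram)  # a tuple of strings is hashable, a list is not
        counts[key] = counts.get(key, 0) + 1
    return counts


A = [['the', 'big', 'bang'], ['big', 'bang', 'theory'], ['the', 'big', 'bang']]

# Using a list directly as a key fails with TypeError (unhashable type).
try:
    d = {}
    d[A[0]] = 1
except TypeError as exc:
    print("list key rejected:", exc)

counts = count_ngrams(A)
print(counts[('the', 'big', 'bang')])     # 2
print(counts[('big', 'bang', 'theory')])  # 1
```

The keys of the result are tuples; call list(key) when a list is needed again.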

You can do this in one line with a dictionary comprehension:

data = {i: A.count(list(i)) for i in set([tuple(j) for j in A])}

If your inner lists can be tuples, for example:

A = [('the', 'big', 'bang'), ('big', 'bang', 'theory'), ('the', 'big', 'bang')]

you can do this:

result = {a: A.count(a) for a in set(A)}  # dict comprehension
print(result)

Output (key order may vary):

{('big', 'bang', 'theory'): 1, ('the', 'big', 'bang'): 2}

# order within each sublist matters
A = [['the','big','bang'],['big','bang','theory'],['the','big','bang']]
x = {}
for val in A:
    # val.sort()  # uncomment this if order within sublists does not matter
    key = str(val)  # the string form of the list is hashable
    x[key] = x.get(key, 0) + 1
print(x)
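
If order within each sublist should not matter, one possible variant is sketched below. It is not taken from the answers above: it swaps the in-place val.sort() and string keys for tuple(sorted(...)) keys fed to Counter, and the sample list (with one reordered 3-gram) is an assumption chosen for illustration.

```python
from collections import Counter

# Hypothetical sample: the first two sublists hold the same words in a different order.
A = [['the', 'big', 'bang'], ['bang', 'big', 'the'], ['big', 'bang', 'theory']]

# tuple(sorted(...)) builds a new canonical key, so A itself is left unchanged.
unordered = Counter(tuple(sorted(gram)) for gram in A)

print(unordered[('bang', 'big', 'the')])     # 2
print(unordered[('bang', 'big', 'theory')])  # 1
```

Unlike val.sort(), sorted() does not reorder the original sublists, which matters if A is used again afterwards.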