Reduce list of lists to dictionary with sublist size as keys and number of occurrences as values

Question

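I have a list of lists, and I want to count how many times sublists of a given size occur.

For example, for the list [[1], [1,2], [1,2], [1,2,3]] I expect to get {1: 1, 2: 2, 3: 1}.

I tried the reduce function, but I get a syntax error at += 1 and have no idea what is wrong.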

list_of_list = [[1], [1,2], [1,2], [1,2,3]]
result = functools.reduce(lambda dict,list: dict[len(list)] += 1, list_of_list, defaultdict(lambda: 0, {}))

Answer 1

You can also do this with Counter:

from collections import Counter
list_of_list = [[1], [1,2], [1,2], [1,2,3]]
c = Counter(len(i) for i in list_of_list)

Output:

Counter({2: 2, 1: 1, 3: 1})
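
If a plain dict in the exact shape asked for in the question is needed, the Counter can be converted. A minimal sketch, assuming Python 3.7+ so that the dict keeps the sorted insertion order (the name as_dict is only for illustration):

```python
from collections import Counter

list_of_list = [[1], [1, 2], [1, 2], [1, 2, 3]]
c = Counter(len(i) for i in list_of_list)

# Sort by sublist size and build an ordinary dict from the pairs.
as_dict = dict(sorted(c.items()))
print(as_dict)  # {1: 1, 2: 2, 3: 1}
```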

Answer 2

Using reduce in such a complicated way is not a good idea when you can use collections.Counter() together with the map() function in a more Pythonic way:

>>> A = [[1], [1,2], [1,2], [1,2,3]]
>>> from collections import Counter
>>> 
>>> Counter(map(len,A))
Counter({2: 2, 1: 1, 3: 1})

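Note that map performs slightly better than a generator expression here: when a generator expression is passed to Counter(), Python has to pull the values out of a generator function, whereas the built-in map function is faster in terms of execution time¹. In the timings that follow the gap is small (4.7 vs. 4.73 usec per loop).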

~$ python -m timeit --setup "A = [[1], [1,2], [1,2], [1,2,3]];from collections import Counter" "Counter(map(len,A))"
100000 loops, best of 3: 4.7 usec per loop
~$ python -m timeit --setup "A = [[1], [1,2], [1,2], [1,2,3]];from collections import Counter" "Counter(len(x) for x in A)"
100000 loops, best of 3: 4.73 usec per loop
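
The same comparison can be run from inside Python with the timeit module. This is only a sketch: the loop count of 10000 is an arbitrary choice, and the numbers will differ by machine and Python version, so no expected output is shown.

```python
import timeit

# Same setup string as in the command-line runs above.
setup = "A = [[1], [1,2], [1,2], [1,2,3]]; from collections import Counter"

# Total seconds for 10000 runs of each statement.
t_map = timeit.timeit("Counter(map(len, A))", setup=setup, number=10000)
t_gen = timeit.timeit("Counter(len(x) for x in A)", setup=setup, number=10000)

print("map:    %.4f s" % t_map)
print("genexp: %.4f s" % t_gen)
```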

From PEP 0289 -- Generator Expressions:

The semantics of a generator expression are equivalent to creating an anonymous generator function and calling it. For example:
g = (x**2 for x in range(10))
print g.next()
is equivalent to:
def __gen(exp):
    for x in exp:
        yield x**2
g = __gen(iter(range(10)))
print g.next()
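
The PEP excerpt uses Python 2 syntax (print statement, g.next()). A rough Python 3 rendering of the same equivalence, with the helper renamed to _gen and the variable names chosen only for illustration:

```python
# Generator expression form.
g = (x**2 for x in range(10))
first_from_genexp = next(g)

# Roughly equivalent explicit generator function.
def _gen(exp):
    for x in exp:
        yield x**2

h = _gen(iter(range(10)))
first_from_function = next(h)

print(first_from_genexp, first_from_function)  # 0 0
```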

¹ Note that generator expressions are better in terms of memory use, so if you are dealing with large data you are better off using a generator expression rather than the map function.
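
The memory point applies to Python 2, where map builds a full list before Counter sees it. In Python 3, map returns a lazy iterator, so both forms feed Counter one value at a time. A small sketch, assuming Python 3 (is_lazy is just an illustrative name):

```python
from collections import Counter

A = [[1], [1, 2], [1, 2], [1, 2, 3]]

lengths = map(len, A)               # Python 3: a lazy map object, not a list
is_lazy = iter(lengths) is lengths  # iterators return themselves from iter()

counts = Counter(lengths)           # consumes the iterator value by value
print(is_lazy)                      # True
print(counts)
```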

Answer 3

reduce is an inferior tool for this job.
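
For reference, the syntax error in the question comes from the fact that += is a statement, and a lambda body may only contain an expression. A reduce version can be made to work with a named function that updates the accumulator and returns it. This is only a sketch, and count_length is a hypothetical helper name:

```python
from collections import defaultdict
from functools import reduce

def count_length(counts, sublist):
    # Augmented assignment is a statement, so it cannot live in a lambda;
    # a named function can run it and then hand the accumulator back.
    counts[len(sublist)] += 1
    return counts

list_of_list = [[1], [1, 2], [1, 2], [1, 2, 3]]
result = reduce(count_length, list_of_list, defaultdict(int))
print(dict(result))  # {1: 1, 2: 2, 3: 1}
```

It runs, but it is more code and less readable than the alternative.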

Look at collections.Counter instead. It is a dict subclass, so you should be able to use it however you were planning to use the dict.

>>> from collections import Counter
>>> L = [[1], [1, 2], [1, 2], [1, 2, 3]]
>>> Counter(len(x) for x in L)
Counter({2: 2, 1: 1, 3: 1})
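
A short sketch of what "use it like a dict" looks like in practice (the name sizes is only for illustration). The one difference worth knowing is that looking up a missing key gives 0 instead of raising KeyError:

```python
from collections import Counter

L = [[1], [1, 2], [1, 2], [1, 2, 3]]
sizes = Counter(len(x) for x in L)

print(isinstance(sizes, dict))  # True
print(sizes[2])                 # 2
print(sizes[5])                 # 0, a missing key counts as zero
print(sorted(sizes.items()))    # [(1, 1), (2, 2), (3, 1)]
```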

python

reduce

lambda

dictionary