Grouping a list of tuples in Python

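I have a list of tuples that I have already sorted by the second item. I now want to group the list by that second item and collect the first items of each group into a list.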

This is my input:

[('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]

What I need is:

[(g1, 2, ['aaa', 'bbb']), (g2, 2, ['ccc', 'ddd']), (g3, 1, ['eee'])]

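In each tuple, the first item is an id (incrementing), the second item is the number of items in the group, and the third item is the list of grouped first items. How can I implement this in Python? I have tried itertools and still got nowhere. Any help would be appreciated.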

In [4]: import itertools, operator

In [5]: L = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]

In [6]: for key, group in itertools.groupby(L, operator.itemgetter(1)):
   ...:     print(key, list(group))
   ...:     
1 [('aaa', 1), ('bbb', 1)]
2 [('ccc', 2), ('ddd', 2)]
3 [('eee', 3)]

In [7]: answer = []

In [8]: for k,group in itertools.groupby(L, operator.itemgetter(1)):
   ...:     answer.append((k, [g[0] for g in group]))
   ...:     

In [9]: answer
Out[9]: [(1, ['aaa', 'bbb']), (2, ['ccc', 'ddd']), (3, ['eee'])]
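
One thing worth keeping in mind: groupby only merges items that sit next to each other, which is why it matters that the list in the question is already sorted by the second item. A small sketch of the difference, where group_names and unsorted_pairs are names made up here for illustration:

```python
import itertools
import operator

get_key = operator.itemgetter(1)


def group_names(pairs):
    """Return (key, [names]) for each run of adjacent pairs sharing a key."""
    return [(key, [name for name, _ in group])
            for key, group in itertools.groupby(pairs, get_key)]


# Hypothetical input where the two key-1 items are NOT adjacent.
unsorted_pairs = [('aaa', 1), ('ccc', 2), ('bbb', 1)]

# Without sorting, key 1 shows up as two separate groups.
print(group_names(unsorted_pairs))
# [(1, ['aaa']), (2, ['ccc']), (1, ['bbb'])]

# Sorting by the same key first gives one group per key.
print(group_names(sorted(unsorted_pairs, key=get_key)))
# [(1, ['aaa', 'bbb']), (2, ['ccc'])]
```

If your real data is not guaranteed to be sorted, sorting with the same key function before calling groupby is the usual fix.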

If you know how to use the collections module, this is easy to solve.

from collections import defaultdict

a = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]

d = defaultdict(list)
for k, v in a:   
    d[v].append(k)

print(list(d.items()))
# [(1, ['aaa', 'bbb']), (2, ['ccc', 'ddd']), (3, ['eee'])]
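
The dict above does not yet have the id and the count that the question asks for. A rough sketch of one way to add them, assuming the groups should be numbered in sorted key order (the question does not say); group_unsorted and shuffled are names invented for this example. Because every key gets its own bucket, the input does not need to be sorted:

```python
from collections import defaultdict


def group_unsorted(pairs):
    """Return (id, count, names) tuples, one per distinct second item."""
    buckets = defaultdict(list)
    for name, key in pairs:
        buckets[key].append(name)
    # Assumption: ids are assigned by walking the keys in sorted order.
    return [
        (i, len(buckets[key]), buckets[key])
        for i, key in enumerate(sorted(buckets), 1)
    ]


# Same data as in the question, deliberately shuffled.
shuffled = [('ccc', 2), ('aaa', 1), ('eee', 3), ('bbb', 1), ('ddd', 2)]
print(group_unsorted(shuffled))
# [(1, 2, ['aaa', 'bbb']), (2, 2, ['ccc', 'ddd']), (3, 1, ['eee'])]
```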

One way is to do it in steps:

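>>> from itertools import groupby
>>> seq = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]
>>> grouped = enumerate(groupby(seq, key=lambda x: x[1]), 1)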
>>> extracted = ((i, [g[0] for g in gg]) for i, (k,gg) in grouped)
>>> final = [(i, len(x), x) for i,x in extracted]
>>> final
[(1, 2, ['aaa', 'bbb']), (2, 2, ['ccc', 'ddd']), (3, 1, ['eee'])]

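But even though each line makes sense on its own, I think it is hard to see what the whole thing is actually doing. A generator function makes it much clearer: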

from itertools import groupby

def grouper(elems):
    grouped = groupby(elems, key=lambda x: x[1])
    for i, (k, group) in enumerate(grouped, 1):
        vals = [g[0] for g in group]
        yield i, len(vals), vals

>>> list(grouper(seq))
[(1, 2, ['aaa', 'bbb']), (2, 2, ['ccc', 'ddd']), (3, 1, ['eee'])]

(Here I arbitrarily used an index starting at 1 for your g1/g2/g3; it would be easy to replace it with something like yield 'g{}'.format(i).)
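
For completeness, a sketch of that replacement, assuming the ids are meant to be plain strings such as 'g1' (the question leaves open what g1 actually is). The name labelled_grouper and the prefix parameter are my own additions:

```python
from itertools import groupby


def labelled_grouper(elems, prefix="g"):
    """Yield (label, count, names) per group; elems must be sorted by x[1]."""
    grouped = groupby(elems, key=lambda x: x[1])
    for i, (_key, group) in enumerate(grouped, 1):
        vals = [g[0] for g in group]
        # Build the label from the running index instead of yielding i itself.
        yield "{}{}".format(prefix, i), len(vals), vals


seq = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]
print(list(labelled_grouper(seq)))
# [('g1', 2, ['aaa', 'bbb']), ('g2', 2, ['ccc', 'ddd']), ('g3', 1, ['eee'])]
```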