Sort a list of namedtuples by the most frequent fields
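What are some elegant and fast ways to sort a list of namedtuples by the most frequently occurring elements in the list?
For example, say we have this list: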
character_list = [
Element(id=1, character='A'),
Element(id=2, character='B'),
Element(id=3, character='B'),
Element(id=4, character='C'),
Element(id=5, character='D'),
Element(id=6, character='E'),
Element(id=7, character='F'),
Element(id=8, character='H'),
Element(id=9, character='I'),
Element(id=10, character='J'),
Element(id=11, character='K'),
Element(id=12, character='L'),
Element(id=13, character='M'),
Element(id=14, character='J'),
Element(id=15, character='N'),
Element(id=16, character='J')]
and we want to sort it like this:
character_list = [
Element(id=10, character='J'),
Element(id=14, character='J'),
Element(id=16, character='J'),
Element(id=2, character='B'),
Element(id=3, character='B'),
Element(id=1, character='A'),
Element(id=4, character='C'),
Element(id=5, character='D'),
Element(id=6, character='E'),
Element(id=7, character='F'),
Element(id=8, character='H'),
Element(id=9, character='I'),
Element(id=11, character='K'),
Element(id=12, character='L'),
Element(id=13, character='M'),
Element(id=15, character='N')]
I tried this, but it doesn't seem to give the result I want:
sorted(character_list, key=lambda x: character_list.count(x.character))
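x.character never appears in your list: the list contains Element objects, not bare characters, so character_list.count(x.character) returns 0 for every element and the key cannot reorder anything. In any case, using list.count like this is very inefficient. Sorting is O(N*log N), but list.count is itself O(N), so calling it once per element in the key function degrades the whole thing to O(N**2).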
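To see why the original attempt changes nothing, here is a minimal sketch. The Element definition is an assumption, since the question never shows how it is declared, and the list is shortened for brevity.

```python
from collections import namedtuple

# Assumed definition; the question does not show how Element is declared.
Element = namedtuple("Element", ["id", "character"])

character_list = [
    Element(1, "A"),
    Element(2, "B"),
    Element(3, "B"),
    Element(4, "J"),
]

# list.count compares its argument with each Element in the list. A str such
# as 'B' never equals an Element tuple, so every count is 0.
keys = [character_list.count(e.character) for e in character_list]
print(keys)  # [0, 0, 0, 0]

# With all keys equal, the stable sort hands back the original order.
unchanged = sorted(character_list, key=lambda x: character_list.count(x.character))
print(unchanged == character_list)  # True
```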
Instead, build a dictionary of counts and use that, which keeps your O(N*log N) performance. So given:
>>> from pprint import pprint
>>> pprint(character_list)
[Element(id=1, character='A'),
Element(id=2, character='B'),
Element(id=3, character='B'),
Element(id=4, character='C'),
Element(id=5, character='D'),
Element(id=6, character='E'),
Element(id=7, character='F'),
Element(id=8, character='H'),
Element(id=9, character='I'),
Element(id=10, character='J'),
Element(id=11, character='K'),
Element(id=12, character='L'),
Element(id=13, character='M'),
Element(id=14, character='J'),
Element(id=15, character='N'),
Element(id=16, character='J')]
then:
>>> from collections import Counter
>>> counts = Counter(e.character for e in character_list)
>>> counts
Counter({'J': 3, 'B': 2, 'A': 1, 'C': 1, 'D': 1, 'E': 1, 'F': 1, 'H': 1, 'I': 1, 'K': 1, 'L': 1, 'M': 1, 'N': 1})
and finally:
>>> def keyfunc(e):
...     return counts[e.character]
...
>>> sorted_character_list = sorted(character_list, key=keyfunc, reverse=True)
>>> pprint(sorted_character_list)
[Element(id=10, character='J'),
Element(id=14, character='J'),
Element(id=16, character='J'),
Element(id=2, character='B'),
Element(id=3, character='B'),
Element(id=1, character='A'),
Element(id=4, character='C'),
Element(id=5, character='D'),
Element(id=6, character='E'),
Element(id=7, character='F'),
Element(id=8, character='H'),
Element(id=9, character='I'),
Element(id=11, character='K'),
Element(id=12, character='L'),
Element(id=13, character='M'),
Element(id=15, character='N')]
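One caveat: the result above relies on the sort being stable, so elements with equal counts keep their original relative order. If two different characters had the same count and their elements were interleaved in the input, they would stay interleaved. The following is a sketch of one possible variation, not part of the answer above: it adds the position of each character's first occurrence as a tie-breaker so equal characters end up adjacent. The Element definition and the helper name sort_by_frequency are assumptions made for illustration.

```python
from collections import Counter, namedtuple

# Assumed definition; the question does not show how Element is declared.
Element = namedtuple("Element", ["id", "character"])


def sort_by_frequency(elements):
    """Return elements ordered by how often their character occurs (hypothetical helper)."""
    counts = Counter(e.character for e in elements)

    # Index at which each character first appears, used to break ties
    # between different characters that share the same count.
    first_seen = {}
    for index, e in enumerate(elements):
        first_seen.setdefault(e.character, index)

    # Negating the count puts the most frequent characters first without
    # reverse=True. The sort is stable, so elements sharing a character
    # keep their original relative order.
    return sorted(
        elements,
        key=lambda e: (-counts[e.character], first_seen[e.character]),
    )


# 'B' and 'J' both occur twice and are interleaved in the input.
data = [
    Element(1, "B"),
    Element(2, "J"),
    Element(3, "B"),
    Element(4, "J"),
    Element(5, "A"),
]
print([e.id for e in sort_by_frequency(data)])  # [1, 3, 2, 4, 5]
```

Building counts and first_seen each takes one pass over the list, so the overall cost stays at O(N*log N).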