Sorting nested defaultdicts in Python

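I have defaultdicts nested two levels deep. The innermost dictionaries contain several fields, and I want to sort the entries by one of those values and delete the entries with the lowest values.

Here is a simplified code sample: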

from collections import defaultdict

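# outer key -> inner key -> field name -> value; missing fields default to ''
# (the factory is str itself; "lambda: str" would return the str class instead)
sampleDict = defaultdict(lambda: defaultdict(lambda: defaultdict(str)))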

sampleDict['keyA']['keyB']['size'] = 1000
sampleDict['keyA']['keyC']['size'] = 500
sampleDict['keyA']['keyD']['size'] = 750
sampleDict['keyA']['keyE']['size'] = 250
sampleDict['keyA']['keyB']['desc'] = 'some data'
sampleDict['keyA']['keyC']['desc'] = 'some more data'
sampleDict['keyA']['keyD']['desc'] = 'different data'
sampleDict['keyA']['keyE']['desc'] = 'other data'

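In this case, I want to sort by size and determine that the largest is ['keyA']['keyB'] and the second largest is ['keyA']['keyD'], then delete ['keyA']['keyC'] and ['keyA']['keyE'].

The reason it is nested is that I will be looping over other entries in the outer dictionary.

Try this: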

>>> import operator
>>> from functools import reduce  # reduce is not a builtin in Python 3
>>> sorted(
...     reduce(operator.add, 
...     [[(k, k1, sampleDict[k][k1]['size']) for k1 in v.keys()]
...              for k,v in sampleDict.items()]
...     ),
...     key=lambda x: x[2], reverse=True)
[('keyA', 'keyB', 1000), ('keyA', 'keyD', 750), ('keyA', 'keyC', 500), ('keyA', 'keyE', 250)]

The reduce call flattens the nested list: it turns [[a],[b,c],[d]] into [a,b,c,d].
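
As a small sketch of that flattening step on its own (the list `nested` below is made up for illustration, not data from the question), `reduce` with `operator.add` concatenates the sublists one pair at a time. `itertools.chain.from_iterable` is a standard-library alternative that avoids building the intermediate lists:

```python
from functools import reduce
from itertools import chain
import operator

# Hypothetical nested list, used only for illustration.
nested = [['a'], ['b', 'c'], ['d']]

# reduce applies operator.add pairwise: (['a'] + ['b', 'c']) + ['d']
flat_reduce = reduce(operator.add, nested)

# chain.from_iterable walks each sublist in turn without intermediate lists.
flat_chain = list(chain.from_iterable(nested))

print(flat_reduce)
print(flat_chain)
```

Both give the same flat list; the `chain` version is probably the better choice when there are many sublists.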

The key argument to sorted specifies that each (k, k1, val) tuple is sorted on its element at index 2 (counting from zero), that is, val.

The reverse argument sorts the list in descending order.
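
A minimal sketch of how `key` and `reverse` work together, using tuples shaped like the ones above (the list `rows` is typed in by hand for illustration). `operator.itemgetter(2)` is used here as an equivalent of `lambda x: x[2]`:

```python
from operator import itemgetter

# Hand-written (outer key, inner key, size) tuples, for illustration only.
rows = [('keyA', 'keyC', 500), ('keyA', 'keyB', 1000),
        ('keyA', 'keyE', 250), ('keyA', 'keyD', 750)]

# Sort on the element at index 2 (the size), largest first.
by_size_desc = sorted(rows, key=itemgetter(2), reverse=True)

# Without reverse=True the order is ascending, so the first two are the smallest.
smallest_two = sorted(rows, key=itemgetter(2))[:2]

print(by_size_desc)
print(smallest_two)
```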

>>> import heapq
>>> [(k, heapq.nlargest(2, sampleDict[k], lambda x: sampleDict[k][x]['size']))
...   for k in sampleDict]
[('keyA', ['keyB', 'keyD'])]

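If you are not concerned about the difference in dict.items between Python 2 and 3 (it returns a list in Python 2 and a view in Python 3), this can also be written as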

>>> [(k, heapq.nlargest(2, v, lambda x: v[x]['size'])) for k,v in sampleDict.items()]
[('keyA', ['keyB', 'keyD'])]
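
The snippets above find the largest entries but do not delete anything. One possible way to finish the task is sketched below, assuming that the two largest entries per outer key should be kept and the rest deleted in place. The helper name `prune_smallest` and its `keep` and `field` parameters are made up for this example:

```python
from collections import defaultdict
import heapq

def prune_smallest(nested, keep=2, field='size'):
    """Keep the `keep` inner entries with the largest `field`; delete the rest in place."""
    for outer_key, inner in nested.items():
        # Keys of the `keep` largest entries under this outer key.
        top = set(heapq.nlargest(keep, inner, key=lambda k: inner[k][field]))
        # Copy the keys first: deleting from a dict while iterating over it raises RuntimeError.
        for inner_key in list(inner):
            if inner_key not in top:
                del inner[inner_key]
    return nested

# Rebuild the sample data from the question (only the 'size' field is needed here).
sampleDict = defaultdict(lambda: defaultdict(dict))
for name, size in [('keyB', 1000), ('keyC', 500), ('keyD', 750), ('keyE', 250)]:
    sampleDict['keyA'][name]['size'] = size

prune_smallest(sampleDict)
print(sorted(sampleDict['keyA']))  # ['keyB', 'keyD']
```

Because only the inner dictionaries are modified, the outer loop can safely iterate over `nested.items()` directly.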