How to merge nested dictionaries?
I have a list of nested dictionaries (Python 3.9) that looks like this:
records = [
    {'Total:': {'Owner:': {'Available:': {'15 to 34 years': 1242}}}},
    {'Total:': {'Owner:': {'Available:': {'35 to 64 years': 5699}}}},
    {'Total:': {'Owner:': {'Available:': {'65 years and over': 2098}}}},
    {'Total:': {'Owner:': {'No Service:': {'15 to 34 years': 43}}}},
    {'Total:': {'Owner:': {'No Service:': {'35 to 64 years': 64}}}},
    {'Total:': {'Owner:': {'No Service:': {'65 years and over': 5}}}},
    {'Total:': {'Renter:': {'Available:': {'15 to 34 years': 1403}}}},
    {'Total:': {'Renter:': {'Available:': {'35 to 64 years': 2059}}}},
    {'Total:': {'Renter:': {'Available:': {'65 years and over': 395}}}},
    {'Total:': {'Renter:': {'No Service:': {'15 to 34 years': 16}}}},
    {'Total:': {'Renter:': {'No Service:': {'35 to 64 years': 24}}}},
    {'Total:': {'Renter:': {'No Service:': {'65 years and over': 0}}}},
]
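The nesting depth is not always consistent. The example above has 4 levels (total, owner/renter, available/no service, age group), but some examples have only a single level and others have as many as 5.
I want to merge the data without the last dictionary replacing the earlier ones, which is what update() or {**dict_a, **dict_b} does.
The final output should look like this: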
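To make the problem concrete, here is a minimal sketch of the shallow-merge behaviour described above. The names dict_a and dict_b are taken from the question; their contents are assumed to be the first two entries of records.

```python
# Assumption: dict_a and dict_b stand in for the first two records.
dict_a = {'Total:': {'Owner:': {'Available:': {'15 to 34 years': 1242}}}}
dict_b = {'Total:': {'Owner:': {'Available:': {'35 to 64 years': 5699}}}}

# Unpacking merges only the top level: the whole 'Total:' value from
# dict_b replaces the one from dict_a.
shallow = {**dict_a, **dict_b}
print(shallow)
# {'Total:': {'Owner:': {'Available:': {'35 to 64 years': 5699}}}}

# update() behaves the same way on a copy of dict_a.
updated = dict(dict_a)
updated.update(dict_b)
print(updated == shallow)  # True
```

The '15 to 34 years' entry is lost in both cases, because only the outermost keys are compared.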
combined = {
    'Total:': {
        'Owner:': {
            'Available:': {
                '15 to 34 years': 1242,
                '35 to 64 years': 5699,
                '65 years and over': 2098
            },
            'No Service:': {
                '15 to 34 years': 43,
                '35 to 64 years': 64,
                '65 years and over': 5
            }
        },
        'Renter:': {
            'Available:': {
                '15 to 34 years': 1403,
                '35 to 64 years': 2059,
                '65 years and over': 395
            },
            'No Service:': {
                '15 to 34 years': 16,
                '35 to 64 years': 24,
                '65 years and over': 0
            }
        },
    }
}
Recursion is a convenient way to navigate and operate on arbitrarily nested structures:
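def combine_into(d: dict, combined: dict) -> None:
    """Recursively merge d into combined, modifying combined in place."""
    for k, v in d.items():
        if isinstance(v, dict):
            # Descend into the matching sub-dict, creating it if needed.
            combine_into(v, combined.setdefault(k, {}))
        else:
            # Leaf value: copy it over as-is.
            combined[k] = v

combined = {}
for record in records:
    combine_into(record, combined)
print(combined)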
{'Total:': {'Owner:': {'Available:': {'15 to 34 years': 1242, '35 to 64 years': 5699, '65 years and over': 2098}, 'No Service:': {'15 to 34 years': 43, '35 to 64 years': 64, '65 years and over': 5}}, 'Renter:': {'Available:': {'15 to 34 years': 1403, '35 to 64 years': 2059, '65 years and over': 395}, 'No Service:': {'15 to 34 years': 16, '35 to 64 years': 24, '65 years and over': 0}}}}
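The general idea here is that each call to combine_into takes one dict and combines it into the combined dict: every value that is itself a dict triggers another recursive call, while any other value is simply copied into combined as-is.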
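If a function that returns the merged result is more convenient than filling a dict in place, the loop can be wrapped as in the sketch below. The helper name merge_all and the three-record sample are assumptions made for illustration; combine_into is repeated so the snippet runs on its own.

```python
def combine_into(d: dict, combined: dict) -> None:
    """Recursively merge d into combined, modifying combined in place."""
    for k, v in d.items():
        if isinstance(v, dict):
            combine_into(v, combined.setdefault(k, {}))
        else:
            combined[k] = v


def merge_all(records: list) -> dict:
    """Hypothetical wrapper: merge every record into one new dict."""
    combined = {}
    for record in records:
        combine_into(record, combined)
    return combined


# A small sample taken from the records in the question.
sample = [
    {'Total:': {'Owner:': {'Available:': {'15 to 34 years': 1242}}}},
    {'Total:': {'Owner:': {'Available:': {'35 to 64 years': 5699}}}},
    {'Total:': {'Renter:': {'Available:': {'15 to 34 years': 1403}}}},
]
merged = merge_all(sample)
print(merged)
# {'Total:': {'Owner:': {'Available:': {'15 to 34 years': 1242, '35 to 64 years': 5699}}, 'Renter:': {'Available:': {'15 to 34 years': 1403}}}}
```

Because every nested dict in the result is created by setdefault(k, {}), the input records themselves are not modified.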
Note that if some of the records disagree about whether a particular node is a leaf, this will raise an exception (or clobber some data)!
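If that kind of disagreement is possible in your data, one option is to detect it explicitly. The following is only a sketch: combine_into_strict and its path parameter are hypothetical names, and raising ValueError is an assumed policy, not the only reasonable one.

```python
def combine_into_strict(d: dict, combined: dict, path: tuple = ()) -> None:
    """Like combine_into, but raise when a leaf and a dict collide."""
    for k, v in d.items():
        here = path + (k,)  # key path, used only for the error message
        if isinstance(v, dict):
            target = combined.setdefault(k, {})
            if not isinstance(target, dict):
                raise ValueError(f"{here!r} is a leaf in one record but a dict in another")
            combine_into_strict(v, target, here)
        else:
            if isinstance(combined.get(k), dict):
                raise ValueError(f"{here!r} is a dict in one record but a leaf in another")
            combined[k] = v


combined = {}
combine_into_strict({'a': {'b': 1}}, combined)
combine_into_strict({'a': {'c': 2}}, combined)
print(combined)  # {'a': {'b': 1, 'c': 2}}

try:
    # 'a' is already a dict, so a plain value here is a conflict.
    combine_into_strict({'a': 5}, combined)
except ValueError as exc:
    print("conflict:", exc)
```

With the original combine_into, the last call would silently replace the whole 'a' sub-dict with 5. In the opposite order (leaf first, dict second) it would fail with a TypeError instead.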