How to merge multiple dictionaries from separate lists if they share any key-value pairs?
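How can I merge dictionaries from multiple lists if they share a common key-value pair?
For example, here are three lists of dictionaries: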
l1 = [{'fruit':'banana','category':'B'},{'fruit':'apple','category':'A'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'2','type':'old'},{'order':'1','type':'new'}]
Desired result:
l = [{'fruit':'apple','category':'A','order':'1','type':'new'},{'fruit':'banana','category':'B','order':'2','type':'old'}]
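The tricky part is that I want this function to take only the lists as arguments, not the keys, because I want to be able to plug in any number of lists of dictionaries without worrying about which key names overlap (in this case, the key names that tie all three lists together are 'category' and 'type').
I should note that index order shouldn't matter, since the merge should be based only on common elements.
Here is my attempt: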
def combine_lists(*args):
    base_list = args[0]
    L = []
    for sublist in args[1:]:
        L.extend(sublist)
    for D in base_list:
        for Dict in L:
            if any([tup in Dict.items() for tup in D.items()]):
                D.update(Dict)
    return base_list
For this problem, it is convenient to view each dict as a list of (key, value) tuples:
In [4]: list({'fruit':'apple','category':'A'}.items())
Out[4]: [('fruit', 'apple'), ('category', 'A')]
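Since we want to connect dicts that share a key-value pair, we can treat each
tuple as a node in a graph, and pairs of tuples from the same dict as edges. Once you have the graph,
the problem reduces to finding the graph's connected components.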
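To make the graph idea concrete, here is a minimal sketch that lists the edges produced by two of the sample dicts. The helper name `edges_for` is hypothetical (it is not part of the original answer), and the sketch assumes Python 3.7+, where dicts keep insertion order.

```python
def edges_for(dct):
    """Link the first (key, value) pair of dct to each of its other pairs."""
    items = list(dct.items())
    return [(items[0], other) for other in items[1:]]


l1 = [{'fruit': 'apple', 'category': 'A'}]
l2 = [{'type': 'new', 'category': 'A'}]

# Flatten both lists and collect one edge list for the whole graph.
edges = [edge for lst in (l1, l2) for dct in lst for edge in edges_for(dct)]
for edge in edges:
    print(edge)
# (('fruit', 'apple'), ('category', 'A'))
# (('type', 'new'), ('category', 'A'))
```

Both edges touch the node ('category', 'A'), so all three tuples end up in the same connected component, which is exactly the merged dict we want.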
Using networkx,
import itertools as IT
import networkx as nx

l1 = [{'fruit':'apple','category':'A'},{'fruit':'banana','category':'B'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'1','type':'new'},{'order':'2','type':'old'}]
data = [l1, l2, l3]

G = nx.Graph()
for dct in IT.chain.from_iterable(data):
    items = list(dct.items())
    # Link the first (key, value) pair of each dict to all of its other pairs.
    node1 = items[0]
    for node2 in items[1:]:
        G.add_edge(node1, node2)

# connected_component_subgraphs was removed in networkx 2.4;
# connected_components yields each component as a set of nodes.
for cc in nx.connected_components(G):
    print(dict(cc))
yields (the key order within each dict may vary)
{'category': 'A', 'fruit': 'apple', 'type': 'new', 'order': '1'}
{'category': 'B', 'fruit': 'banana', 'type': 'old', 'order': '2'}
If you want to drop the networkx dependency, you could use, for example, pillmuncher's implementation:
import itertools as IT

def connected_components(neighbors):
    """
    (pillmuncher)
    """
    seen = set()
    def component(node):
        # Lazily walk one component; each yielded generator must be
        # consumed before the next one is requested.
        nodes = set([node])
        while nodes:
            node = nodes.pop()
            seen.add(node)
            nodes |= neighbors[node] - seen
            yield node
    for node in neighbors:
        if node not in seen:
            yield component(node)

l1 = [{'fruit':'apple','category':'A'},{'fruit':'banana','category':'B'}]
l2 = [{'type':'new','category':'A'},{'type':'old','category':'B'}]
l3 = [{'order':'1','type':'new'},{'order':'2','type':'old'}]
data = [l1, l2, l3]

# Adjacency mapping: each (key, value) pair maps to the set of pairs it touches.
G = {}
for dct in IT.chain.from_iterable(data):
    items = list(dct.items())  # list() so the view can be indexed in Python 3
    node1 = items[0]
    for node2 in items[1:]:
        G.setdefault(node1, set()).add(node2)
        G.setdefault(node2, set()).add(node1)

for cc in connected_components(G):
    print(dict(cc))
which prints the same result as above.
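Since the question asks for a function that takes only the lists as arguments, the same idea can be wrapped up roughly as follows. This is a sketch rather than a definitive implementation: the name `merge_dict_lists` is made up for illustration, it uses an iterative depth-first search instead of pillmuncher's generator, and it assumes all values are hashable.

```python
import itertools


def merge_dict_lists(*lists):
    """Merge dicts from any number of lists when they share a (key, value) pair."""
    # Build an undirected graph: nodes are (key, value) pairs, and every
    # pair in a dict is linked to that dict's first pair.
    neighbors = {}
    for dct in itertools.chain.from_iterable(lists):
        items = list(dct.items())
        if not items:
            continue  # an empty dict contributes nothing
        first = items[0]
        neighbors.setdefault(first, set())
        for other in items[1:]:
            neighbors[first].add(other)
            neighbors.setdefault(other, set()).add(first)

    # Collect each connected component with an iterative depth-first search.
    merged, seen = [], set()
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], {}
        while stack:
            key, value = stack.pop()
            component[key] = value
            for nxt in neighbors[(key, value)] - seen:
                seen.add(nxt)
                stack.append(nxt)
        merged.append(component)
    return merged


l1 = [{'fruit': 'banana', 'category': 'B'}, {'fruit': 'apple', 'category': 'A'}]
l2 = [{'type': 'new', 'category': 'A'}, {'type': 'old', 'category': 'B'}]
l3 = [{'order': '2', 'type': 'old'}, {'order': '1', 'type': 'new'}]

result = merge_dict_lists(l1, l2, l3)
for dct in result:
    print(dct)
```

One caveat: if a single component ends up containing two different values for the same key, the merged dict keeps only one of them, so this approach fits data where shared pairs identify the same record.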