Remove Python dictionary entries for keys whose values are a subset of another key's values
I have a dictionary generated using defaultdict:
{"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
"GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
"GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}
One of the entries is, in terms of its values, a subset of another entry:
"GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"]
is a subset of
"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"]
How would I collapse the dictionary so that I end up with either of these results?
{"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
"GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}
or
{["GGGAAATTTCCCTTTGGGAAACGG", "GGGAAATTTCCCTTTGGGAAAGCC"]:
["9/1", "9/2", "1/1.1", "9/2.1"],
"GGGAAATTTCCCTTTGGGAAAGGG":
["1/1", "1/2", "9/1", "1/1.1"]}
Edit:
So, as requested, here is my attempt:
# dd is my defaultdict
for keys, values in dd.iteritems():
    if all(for item in values:
               if item in dd.items():
                   return True
               else:
                   return False):
        print keys
You could try this:
mydict = {"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
"GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
"GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}
>>>dict([i for i in mydict.items() if not any(set(j).issuperset(set(i[1])) and j!=i[1] for j in mydict.values())])
{'GGGAAATTTCCCTTTGGGAAACGG': ['9/1', '9/2', '1/1.1', '9/2.1'],
'GGGAAATTTCCCTTTGGGAAAGGG': ['1/1', '1/2', '9/1', '1/1.1']}
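The one-liner above can be unpacked into a small function, which may be easier to read and reuse. This is only a sketch: the name `drop_subset_entries` is made up for illustration, and it assumes the order of items inside each list does not matter. It uses the proper-subset operator `<` on sets, so two keys with the same items (even in a different order) are both kept rather than both dropped.

```python
def drop_subset_entries(d):
    """Return a new dict without entries whose values are a proper
    subset of another entry's values. The input dict is not modified."""
    # Convert each value list to a set once, instead of on every comparison.
    as_sets = {key: set(values) for key, values in d.items()}
    return {
        key: values
        for key, values in d.items()
        # `a < b` on sets is True only when a is a proper subset of b,
        # so an entry is never compared "successfully" against itself.
        if not any(as_sets[key] < other for other in as_sets.values())
    }


mydict = {"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
          "GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
          "GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}

result = drop_subset_entries(mydict)
```

Because it builds a new dictionary, the original `mydict` is left untouched.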
Or simply:
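# Loop over copies so entries can be popped while iterating
# (on Python 3, popping from the dict being iterated raises RuntimeError).
for i in list(mydict.items()):
    for j in list(mydict.values()):
        if i[1] != j and set(j).issuperset(set(i[1])):
            mydict.pop(i[0])
            break  # this key is gone; stop comparing it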
>>>mydict
{'GGGAAATTTCCCTTTGGGAAACGG': ['9/1', '9/2', '1/1.1', '9/2.1'],
'GGGAAATTTCCCTTTGGGAAAGGG': ['1/1', '1/2', '9/1', '1/1.1']}
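For the second result asked for in the question (grouping the keys together), note that a list cannot be used as a dictionary key because lists are not hashable, so a tuple would have to stand in for it. A rough sketch under that assumption follows; `merge_subset_keys` is a hypothetical name, and an entry that is a subset of several surviving entries would be attached to each of them.

```python
def merge_subset_keys(d):
    """Return a new dict in which every key whose values are a proper
    subset of another entry's values is folded into that entry's key."""
    as_sets = {key: set(values) for key, values in d.items()}
    # Keys whose value set is not a proper subset of any other value set.
    keepers = [k for k in d
               if not any(as_sets[k] < s for s in as_sets.values())]
    merged = {}
    for keeper in keepers:
        # Keys whose values are fully contained in this keeper's values.
        absorbed = [k for k in d if as_sets[k] < as_sets[keeper]]
        # Tuples are hashable, unlike lists, so they can serve as keys.
        new_key = tuple([keeper] + absorbed) if absorbed else keeper
        merged[new_key] = d[keeper]
    return merged


data = {"GGGAAATTTCCCTTTGGGAAACGG": ["9/1", "9/2", "1/1.1", "9/2.1"],
        "GGGAAATTTCCCTTTGGGAAAGCC": ["9/2", "9/2.1"],
        "GGGAAATTTCCCTTTGGGAAAGGG": ["1/1", "1/2", "9/1", "1/1.1"]}

merged = merge_subset_keys(data)
```

With the sample data, the first entry ends up under a two-element tuple key, while the third keeps its plain string key.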