How to get the dictionary name for each key with the highest value across multiple dictionaries
I have a list of unique names called AllNames and several sublists in which these names can be repeated or may not appear at all. For example:
AllNames = ['John', 'Mark', 'Tony', 'Bob', 'Jack']
blue = ['John', 'John', 'Mark', 'Jack']
green = ['Mark', 'Mark', 'Jack']
red = ['Bob', 'Jack']
# These are dictionaries with the counts for each list
blueCounter = Counter({'John': 2, 'Mark': 1, 'Jack': 1})
greenCounter = Counter({'Mark': 2, 'Jack': 1})
redCounter = Counter({'Bob': 1, 'Jack': 1})
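For each name in AllNames, I want to find the list in which that name has the highest count and put the name into a new list for it. Based on the above, I would get: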
blueList = ['John'] # since John has the highest count across all lists
greenList = ['Mark']
redList =['Bob']
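Note that 'Jack' appears the same number of times in all three lists, so I don't want his name to appear anywhere. Similarly, 'Tony', whose name does not appear in any list, should not be included.
I tried to apply stats.iteritems, but that only gets the highest value within a single dictionary.
Edit: another example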
a=['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']
b= ['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']
c= ['Mark', 'Joe', 'Jack', 'Jack', 'Tony']
ac = Counter(a)
bc = Counter(b)
cc = Counter(c)
# >>> ac
# Counter({'John': 3, 'Mark': 3, 'Joe': 1})
# >>> bc
# Counter({'Joe': 3, 'Jack': 1, 'John': 1, 'Mark': 1})
# >>> cc
# Counter({'Jack': 2, 'Tony': 1, 'Joe': 1, 'Mark': 1})
The result should be:
alist = ['John', 'Mark']
blist = ['Joe']
clist = ['Jack', 'Tony']
You don't even need AllNames:
from collections import Counter
lists = { 'blue': ['John', 'John', 'Mark', 'Jack'], 'green': ['Mark', 'Mark', 'Jack'], 'red': ['Bob', 'Jack']}
for li in lists:
    print(li, Counter(lists[li]).most_common()[0])
# Output:
# blue ('John', 2)
# green ('Mark', 2)
# red ('Bob', 1)
If you want to build a dictionary that maps each list name to its most common name, just do this:
most_common_dict = {}
for li in lists:
    most_common_dict[li] = Counter(lists[li]).most_common()[0][0]
# most_common_dict == {'blue': 'John', 'green': 'Mark', 'red': 'Bob'}
Or, if you are using Python 2.7 or later, in one line:
most_common_dict = { li: Counter(lists[li]).most_common()[0][0] for li in lists }
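A slightly tighter variant of the one-liner, as a sketch: most_common(1) returns only the top entry, and the `if names` guard skips empty lists, for which most_common(1) returns an empty list. The 'empty' entry is a hypothetical addition, included only to show the guard.

```python
from collections import Counter

lists = {
    'blue': ['John', 'John', 'Mark', 'Jack'],
    'green': ['Mark', 'Mark', 'Jack'],
    'red': ['Bob', 'Jack'],
    'empty': [],  # hypothetical extra entry, not in the original data
}

# Map each non-empty list name to its most common name.
most_common_dict = {
    list_name: Counter(names).most_common(1)[0][0]
    for list_name, names in lists.items()
    if names  # an empty list has no most common element
}
print(most_common_dict)
```

Note that this still picks the top name within each list separately. It does not compare a name's count across lists.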
Like this:
from collections import Counter
AllNames = ['John', 'Mark', 'Tony', 'Bob', 'Jack']
blue = ['John', 'John', 'Mark', 'Jack']
green = ['Mark', 'Mark', 'Jack']
red = ['Bob', 'Jack']
# These are dictionaries with the counts for each list
blueCounter = Counter({'John': 2, 'Mark': 1, 'Jack': 1})
greenCounter = Counter({'Mark': 2, 'Jack': 1})
redCounter = Counter({'Bob': 1, 'Jack': 1})
names = []
for counter in blueCounter, greenCounter, redCounter:
    name_with_highest_count = counter.most_common()[0][0]
    if name_with_highest_count in AllNames:
        names.append([name_with_highest_count])
blueList, greenList, redList = names
print(blueList) # -> ['John']
print(greenList) # -> ['Mark']
print(redList) # -> ['Bob']
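The question also asks that a name tied across lists ('Jack') or absent from every list ('Tony') be left out. A minimal sketch of that rule follows, assuming the counters are kept in a dict keyed by list name. The names `counters`, `result` and `winners` are illustrative and not from the original post.

```python
from collections import Counter

AllNames = ['John', 'Mark', 'Tony', 'Bob', 'Jack']
counters = {
    'blue': Counter(['John', 'John', 'Mark', 'Jack']),
    'green': Counter(['Mark', 'Mark', 'Jack']),
    'red': Counter(['Bob', 'Jack']),
}

result = {list_name: [] for list_name in counters}
for name in AllNames:
    # A Counter returns 0 for a name it has never seen.
    counts = {list_name: c[name] for list_name, c in counters.items()}
    best = max(counts.values())
    winners = [list_name for list_name, n in counts.items() if n == best]
    # Keep the name only if exactly one list holds the highest non-zero count.
    if best > 0 and len(winners) == 1:
        result[winners[0]].append(name)

print(result)
```

With the sample data this leaves 'John' under 'blue', 'Mark' under 'green' and 'Bob' under 'red'. 'Jack' and 'Tony' are dropped.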
This should do it:
import numpy as np
from collections import Counter
a=['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']
b= ['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']
c= ['Mark', 'Joe', 'Jack', 'Jack', 'Tony']
ac = Counter(a)
bc = Counter(b)
cc = Counter(c)
allLists = [[], [], []]
# Union of all keys; in Python 3, dict key views cannot be concatenated with +
allNames = set(ac) | set(bc) | set(cc)
for name in allNames:
    # A Counter returns 0 for names it does not contain
    allCounts = np.array([ac[name], bc[name], cc[name]])
    # Index of the counter with the highest count for this name
    maxIndex = allCounts.argsort()[::-1][0]
    allLists[maxIndex].append(name)
alist, blist, clist = allLists
print(alist, blist, clist)
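The same grouping can be sketched without NumPy by letting the built-in max pick the counter with the largest count for each name. The `counters` and `result` names are illustrative. Names are sorted only to make the output order predictable, and on a tie max keeps the first counter in dict order, which is an assumption about how ties should fall.

```python
from collections import Counter

a = ['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']
b = ['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']
c = ['Mark', 'Joe', 'Jack', 'Jack', 'Tony']
counters = {'a': Counter(a), 'b': Counter(b), 'c': Counter(c)}

result = {label: [] for label in counters}
for name in sorted(set(a + b + c)):
    # Pick the label of the counter in which this name has the largest count.
    best_label = max(counters, key=lambda label: counters[label][name])
    result[best_label].append(name)

alist, blist, clist = result['a'], result['b'], result['c']
print(alist, blist, clist)
```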
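As I said in a comment, you have changed the question. Besides that, it doesn't make sense to now have 3 explicitly named result lists (alist, blist and clist) and also ask for "for each key in the union of all dictionaries, the name of the dictionary where that key has the highest value".
The quoted part implies that there can be one dictionary name per key, and since there are more than 3 keys in total, there could be more than 3 "results".
Here is an attempt that ignores this inconsistency and implements what you said in the quote: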
from collections import Counter
a = ['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']
b = ['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']
c = ['Mark', 'Joe', 'Jack', 'Jack', 'Tony']
ac = Counter(a)
bc = Counter(b)
cc = Counter(c)
print('ac: {}'.format(ac))
print('bc: {}'.format(bc))
print('cc: {}'.format(cc))
print('')
key_dict = {} # maps key to name of dictionary where it has highest value
for key in set(a+b+c):  # set(a+b+c) == all keys
    highest_dict_name, highest_value = None, -1
    for dict_name, counter in (('ac', ac), ('bc', bc), ('cc', cc)):
        s = set(counter.values())
        # Group the keys of this counter by their count
        d = {j : [i for i in counter if counter[i] == j] for j in s}
        # (highest count in this counter, keys that have that count)
        highest_item = sorted(d.items(), key=lambda i: i[0])[-1]
        if (highest_item[0] > highest_value and
                key in highest_item[1]):
            highest_dict_name, highest_value = dict_name, highest_item[0]
    key_dict[key] = highest_dict_name
print('For each key, dictionary where it has highest value:')
for key in sorted(key_dict):
    print(' {!r} in {!r}'.format(key, key_dict[key]))
Output:
ac: Counter({'John': 3, 'Mark': 3, 'Joe': 1})
bc: Counter({'Joe': 3, 'Jack': 1, 'John': 1, 'Mark': 1})
cc: Counter({'Jack': 2, 'Tony': 1, 'Joe': 1, 'Mark': 1})
For each key, dictionary where it has highest value:
 'Jack' in 'cc'
 'Joe' in 'bc'
 'John' in 'ac'
 'Mark' in 'ac'
 'Tony' in None
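A more literal reading of the quoted sentence compares each key's own count across the dictionaries. The sketch below takes that reading and is not the code above: the helper `dict_with_highest_value` is hypothetical, and mapping tied keys to None is an assumption. Under this reading 'Tony' maps to 'cc', since cc is the only counter that contains it.

```python
from collections import Counter

counters = {
    'ac': Counter(['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']),
    'bc': Counter(['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']),
    'cc': Counter(['Mark', 'Joe', 'Jack', 'Jack', 'Tony']),
}

def dict_with_highest_value(key):
    """Name of the counter where `key` has a strictly highest count, else None."""
    ranked = sorted(counters, key=lambda name: counters[name][key], reverse=True)
    top, runner_up = ranked[0], ranked[1]
    if counters[top][key] == counters[runner_up][key]:
        return None  # tie for first place
    return top

# Union of the keys of all counters (iterating a Counter yields its keys).
all_keys = set().union(*counters.values())
key_dict = {key: dict_with_highest_value(key) for key in sorted(all_keys)}
for key, dict_name in key_dict.items():
    print(' {!r} in {!r}'.format(key, dict_name))
```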