How to get the dictionary name for each key with the highest value across multiple dictionaries

Question

I have a list of unique names called AllNames and several sublists in which those names may be repeated or may not appear at all. For example:

AllNames = ['John', 'Mark', 'Tony', 'Bob', 'Jack']

blue = ['John', 'John', 'Mark', 'Jack']
green = ['Mark', 'Mark', 'Jack']
red = ['Bob', 'Jack']
# These are dictionaries with the counts for each list
blueCounter = Counter({'John': 2, 'Mark': 1, 'Jack': 1})
greenCounter = Counter({'Mark': 2, 'Jack': 1})
redCounter = Counter({'Bob': 1, 'Jack': 1})

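For each name in AllNames, I want to find the list in which that name has the highest count and put the name into a new list corresponding to it. Based on the above, I would get: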

blueList = ['John'] # since John has the highest count across all lists
greenList = ['Mark']
redList = ['Bob']

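Note that 'Jack', who appears the same number of times in all three lists, should not show up in any of the result lists. Likewise, 'Tony', whose name does not appear in any list, should not be included.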

I tried using stats.iteritems, but that only gives the highest value within a single dictionary.

Edit: another example

a = ['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']
b = ['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']
c = ['Mark', 'Joe', 'Jack', 'Jack', 'Tony']

ac = Counter(a)
bc = Counter(b)
cc = Counter(c)

# >>> ac
# Counter({'John': 3, 'Mark': 3, 'Joe': 1})
# >>> bc
# Counter({'Joe': 3, 'Jack': 1, 'John': 1, 'Mark': 1})
# >>> cc
# Counter({'Jack': 2, 'Tony': 1, 'Joe': 1, 'Mark': 1})

The result should be:

alist = ['John', 'Mark']
blist = ['Joe']
clist = ['Jack', 'Tony']

Answer 1

You don't even need AllNames:

from collections import Counter

lists = { 'blue': ['John', 'John', 'Mark', 'Jack'], 'green': ['Mark', 'Mark', 'Jack'], 'red': ['Bob', 'Jack']}

for li in lists:
    # most_common()[0] is the (name, count) pair with the highest count in this list
    print(li, Counter(lists[li]).most_common()[0])

>> blue ('John', 2)
   green ('Mark', 2)
   red ('Bob', 1)

If you want to build a dictionary mapping each list name to its most common name, just do this:

most_common_dict = {}
for li in lists:
    most_common_dict[li] = Counter(lists[li]).most_common()[0][0]
>>  {'blue': 'John', 'green': 'Mark', 'red': 'Bob'}

If you are using Python 2.7 or later, you can do it in one line:

most_common_dict = { li: Counter(lists[li]).most_common()[0][0] for li in lists }
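
Note that the snippets above pick the most common name within each list, which is not quite the per-name comparison the question describes (a tie such as 'Jack' goes undetected). Here is a minimal sketch of that comparison, assuming a name tied for first place should be dropped. The helper name names_by_best_list is made up for illustration and is not part of collections.

```python
from collections import Counter


def names_by_best_list(counters):
    """Group names under the label of the list where their count is strictly highest.

    `counters` maps a label such as 'blue' to a Counter of names.
    A name whose top count is shared by two or more lists is skipped.
    """
    result = {label: [] for label in counters}
    all_names = set().union(*counters.values())  # every name seen in any list
    for name in sorted(all_names):
        # Order the labels by this name's count, highest first.
        ranked = sorted(counters, key=lambda label: counters[label][name], reverse=True)
        top = counters[ranked[0]][name]
        second = counters[ranked[1]][name] if len(ranked) > 1 else 0
        if top > second:  # strict winner, so ties like 'Jack' are dropped
            result[ranked[0]].append(name)
    return result


lists = {
    'blue': ['John', 'John', 'Mark', 'Jack'],
    'green': ['Mark', 'Mark', 'Jack'],
    'red': ['Bob', 'Jack'],
}
counters = {label: Counter(names) for label, names in lists.items()}
result = names_by_best_list(counters)
print(result)  # {'blue': ['John'], 'green': ['Mark'], 'red': ['Bob']}
```

Names that appear in no list, such as 'Tony', never enter all_names, so AllNames is not needed here either.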

Answer 2

Like this:

from collections import Counter

AllNames = ['John', 'Mark', 'Tony', 'Bob', 'Jack']

blue = ['John', 'John', 'Mark', 'Jack']
green = ['Mark', 'Mark', 'Jack']
red = ['Bob', 'Jack']

# These are dictionaries with the counts for each list
blueCounter = Counter({'John': 2, 'Mark': 1, 'Jack': 1})
greenCounter = Counter({'Mark': 2, 'Jack': 1})
redCounter = Counter({'Bob': 1, 'Jack': 1})

names = []
for counter in blueCounter, greenCounter, redCounter:
    name_with_highest_count = counter.most_common()[0][0]
    if name_with_highest_count in AllNames:
        names.append([name_with_highest_count])

blueList, greenList, redList = names

print(blueList)  # -> ['John']
print(greenList) # -> ['Mark']
print(redList)   # -> ['Bob']

Answer 3

This should do it:

import numpy as np
from collections import Counter

a = ['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']
b = ['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']
c= ['Mark', 'Joe', 'Jack', 'Jack', 'Tony']

ac = Counter(a)
bc = Counter(b)
cc = Counter(c)


allLists = [[], [], []]

# Union of every name seen in any Counter
# (on Python 3, keys() views cannot be concatenated with +)
allNames = set(ac) | set(bc) | set(cc)

for name in allNames:
    # A Counter returns 0 for a name it has not seen
    allCounts = np.array([ac[name], bc[name], cc[name]])

    # Index of the largest count; ties are not detected here
    maxIndex = allCounts.argsort()[::-1][0]

    allLists[maxIndex].append(name)

alist, blist, clist = allLists

print(alist, blist, clist)
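
One caveat with the argsort step: when two lists share the top count, the name still ends up in one of them. If ties should be skipped, as the question asks for 'Jack', a small plain-Python helper could stand in for that step. This is only a sketch, and unique_argmax is a hypothetical name, not a NumPy function.

```python
def unique_argmax(counts):
    """Return the index of the single largest value, or None if the maximum is shared."""
    top = max(counts)
    winners = [i for i, value in enumerate(counts) if value == top]
    return winners[0] if len(winners) == 1 else None


print(unique_argmax([3, 1, 0]))  # 0
print(unique_argmax([1, 1, 1]))  # None
```

Inside the loop, a name would then be appended only when the returned index is not None.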

Answer 4

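As I said in a comment below, you have changed the question. Besides that, it doesn't make sense to now have 3 explicitly named result lists (alist, blist, and clist) and also ask for "for each key in the union of all dictionaries, the name of the dictionary where that key has the highest value".

The latter quoted part implies that there can be a dictionary name for each key, and since there are more than 3 keys in total, there could be more than 3 "results".

Here is an attempt to ignore that inconsistency and implement what you said in the quote: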

from collections import Counter

a = ['John', 'John', 'John', 'Mark', 'Mark', 'Mark', 'Joe']
b = ['John', 'Mark', 'Joe', 'Joe', 'Joe', 'Jack']
c = ['Mark', 'Joe', 'Jack', 'Jack', 'Tony']

ac = Counter(a)
bc = Counter(b)
cc = Counter(c)

print('ac: {}'.format(ac))
print('bc: {}'.format(bc))
print('cc: {}'.format(cc))
print('')

key_dict = {}  # maps key to name of dictionary where it has highest value

for key in set(a+b+c):  # set(a+b+c) == all keys
    highest_dict_name, highest_value = None, -1

    for dict_name, counter in (('ac', ac), ('bc', bc), ('cc', cc)):
        s = set(counter.values())
        d = {j : [i for i in counter if counter[i] == j] for j in s}
        highest_item = sorted(d.items(), key=lambda i: i[0])[-1]
        if (highest_item[0] > highest_value and
            key in highest_item[1]):
            highest_dict_name, highest_value = dict_name, highest_item[0]

    key_dict[key] = highest_dict_name

print('For each key, dictionary where it has highest value:')
for key in sorted(key_dict):
    print('  {!r} in {!r}'.format(key, key_dict[key]))

Output:

ac: Counter({'John': 3, 'Mark': 3, 'Joe': 1})
bc: Counter({'Joe': 3, 'Jack': 1, 'John': 1, 'Mark': 1})
cc: Counter({'Jack': 2, 'Tony': 1, 'Joe': 1, 'Mark': 1})

For each key, dictionary where it has highest value:
  'Jack' in 'cc'
  'Joe' in 'bc'
  'John' in 'ac'
  'Mark' in 'ac'
  'Tony' in None
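
If per-dictionary lists like alist, blist, and clist are still wanted, the key-to-dictionary mapping can be inverted. A minimal sketch, assuming key_dict holds exactly the values shown in the output above and that keys mapped to None are left out; the name names_by_dict is introduced here only for illustration.

```python
from collections import defaultdict

# Values copied from the output above
key_dict = {'Jack': 'cc', 'Joe': 'bc', 'John': 'ac', 'Mark': 'ac', 'Tony': None}

names_by_dict = defaultdict(list)
for key in sorted(key_dict):
    dict_name = key_dict[key]
    if dict_name is not None:  # skip keys with no winning dictionary
        names_by_dict[dict_name].append(key)

names_by_dict = dict(names_by_dict)
print(names_by_dict)  # {'cc': ['Jack'], 'bc': ['Joe'], 'ac': ['John', 'Mark']}
```

'Tony' is absent here because the code above maps it to None, which differs from the clist the question expects.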
