What data structure acts like a dictionary and supports multiple values per key?

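I have player and jersey data. The player's name is the key and the uniform number is the value. A player may wear several uniform numbers over the course of a career.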

{"Michael Jordan":45}
{"Michael Jordan":23}

This doesn't work, because the second entry replaces the first:

myData = {}
myData["Michael Jordan"] = 45
myData["Michael Jordan"] = 23

I can't store this in a plain dictionary. What data structure should I use?

A dict of lists or a dict of sets is a very common structure for holding multiple values per key.

data = [('Michael Jordan', 45), ('Kobe Bryant', 8), ('Michael Jordan', 23), ('Kobe Bryant', 24), ('Michael Jordan', 23)]

d = {}
for player, uniform in data:
    if player not in d:
        d[player] = [uniform]
    else:
        d[player].append(uniform)

Personally, I don't like the if / else in that loop. You can get rid of it by using dict.get with a default value, dict.setdefault, or defaultdict:

# 1ST METHOD: DICT.GET WITH DEFAULT VALUE
d = {}
for player, uniform in data:
    d[player] = d.get(player, [])
    d[player].append(uniform)

# 2ND METHOD: DICT.SETDEFAULT
d = {}
for player, uniform in data:
    d.setdefault(player, []).append(uniform)

# 3RD METHOD: DEFAULTDICT
from collections import defaultdict
d = defaultdict(list)
for player, uniform in data:
    d[player].append(uniform)

In all cases, the result is the same:

# 1st and 2nd methods
print(d)
# {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]}

# 3rd method
print(d)
# defaultdict(<class 'list'>, {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]})
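
The printed output is not the only difference between the two kinds of result. As a small sketch (the lookup key 'Larry Bird' is just an illustrative name that is not in the data), indexing a defaultdict with a missing key quietly adds that key, while .get() does not, and a plain dict made with dict(d) is unaffected by later lookups:

```python
from collections import defaultdict

data = [('Michael Jordan', 45), ('Kobe Bryant', 8), ('Michael Jordan', 23),
        ('Kobe Bryant', 24), ('Michael Jordan', 23)]

d = defaultdict(list)
for player, uniform in data:
    d[player].append(uniform)

# Snapshot as a plain dict (the lists themselves are shared, not copied)
plain = dict(d)

# .get() never calls the default factory, so nothing is added to d
via_get = d.get('Larry Bird')
# Indexing a missing key calls list() and stores the empty result in d
via_index = d['Larry Bird']

print(via_get)                # None
print(via_index)              # []
print('Larry Bird' in d)      # True
print('Larry Bird' in plain)  # False
```

If the grouped data is going to be handed to other code, converting it to a plain dict first is one way to avoid keys appearing by accident.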

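However, maybe we don't like the duplicate value 23 among Michael Jordan's jerseys. If we don't care about the order of the values but do care about removing duplicates, then the go-to data structure is set rather than list.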

Our 4 possible approaches become:

d = {}
for player, uniform in data:
    if player not in d:
        d[player] = {uniform}
    else:
        d[player].add(uniform)

# 1ST METHOD: DICT.GET WITH DEFAULT VALUE
d = {}
for player, uniform in data:
    d[player] = d.get(player, set())
    d[player].add(uniform)

# 2ND METHOD: DICT.SETDEFAULT
d = {}
for player, uniform in data:
    d.setdefault(player, set()).add(uniform)

# 3RD METHOD: DEFAULTDICT
from collections import defaultdict
d = defaultdict(set)
for player, uniform in data:
    d[player].add(uniform)

# RESULT

# 1st and 2nd methods
print(d)
# {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}}

# 3rd method
print(d)
# defaultdict(<class 'set'>, {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}})
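
The set-based versions above give up the order in which the numbers were seen. If both deduplication and first-seen order matter, one possible variation (a sketch, not part of the approaches above) is to use an inner dict as an ordered set, relying on the fact that dicts keep insertion order in Python 3.7+:

```python
data = [('Michael Jordan', 45), ('Kobe Bryant', 8), ('Michael Jordan', 23),
        ('Kobe Bryant', 24), ('Michael Jordan', 23)]

d = {}
for player, uniform in data:
    # Inner dict acts as an ordered set: the keys are the uniform numbers,
    # the values are unused placeholders
    d.setdefault(player, {})[uniform] = None

# Turn each inner dict back into a list of its keys
result = {player: list(uniforms) for player, uniforms in d.items()}
print(result)
# {'Michael Jordan': [45, 23], 'Kobe Bryant': [8, 24]}
```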

Finally, I'd like to show one more approach, using the function map_reduce from the module more_itertools:

from operator import itemgetter
from more_itertools import map_reduce

# DICT OF LIST
d = map_reduce(data, keyfunc=itemgetter(0), valuefunc=itemgetter(1))
print(d)
# defaultdict(None, {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]})

# DICT OF SET
d = map_reduce(data, keyfunc=itemgetter(0), valuefunc=itemgetter(1), reducefunc=set)
print(d)
# defaultdict(None, {'Michael Jordan': {45, 23}, 'Kobe Bryant': {8, 24}})
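
If installing a third-party package is not an option, the same grouping can be roughly approximated with the standard library alone. The helper below, group_values, is a hypothetical name made up for this sketch and is not how more_itertools implements map_reduce; it only mimics the behaviour shown above and returns plain dicts instead of a defaultdict:

```python
from collections import defaultdict
from operator import itemgetter


def group_values(iterable, keyfunc, valuefunc, reducefunc=None):
    """Hypothetical helper: collect valuefunc(item) under keyfunc(item)."""
    groups = defaultdict(list)
    for item in iterable:
        groups[keyfunc(item)].append(valuefunc(item))
    if reducefunc is None:
        return dict(groups)
    # Apply the reducer to each list of grouped values
    return {key: reducefunc(values) for key, values in groups.items()}


data = [('Michael Jordan', 45), ('Kobe Bryant', 8), ('Michael Jordan', 23),
        ('Kobe Bryant', 24), ('Michael Jordan', 23)]

lists = group_values(data, itemgetter(0), itemgetter(1))
sets = group_values(data, itemgetter(0), itemgetter(1), reducefunc=set)
counts = group_values(data, itemgetter(0), itemgetter(1), reducefunc=len)

print(lists)   # {'Michael Jordan': [45, 23, 23], 'Kobe Bryant': [8, 24]}
print(counts)  # {'Michael Jordan': 3, 'Kobe Bryant': 2}
```

Passing a different reducefunc (len, max, sorted, and so on) is what makes this shape handy: the grouping step stays the same and only the final summary changes.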