How to create nested dictionaries with duplicate keys in Python
I want to create a data structure with nested dictionaries and duplicate keys. A detailed example:
data['State1']['Landon']['abc Area'] = 'BOB'
data['State1']['Landon']['abc Area'] = 'SAM'
data['State1']['Landon']['xyz Area'] = 'John'
data['State2']['New York']['hjk Area'] = 'Ricky'
# Desired: iterating over data['State1'].keys() gives ['Landon', 'Landon', 'Landon']
# Desired: iterating over data['State1']['Landon'].keys() gives ['abc Area', 'abc Area', 'xyz Area']
Currently, to store the data, I use an extra counter key:
data = AutoVivification()
data[state][city][area][counter] = ID
But to collect the total entries for City/Area (including the duplicates), I have to loop all the way down to the counter key:
cityList, areaList = [], []
for city in data['State1'].keys():
    for area in data['State1'][city].keys():
        for counter in data['State1'][city][area].keys():
            for temp in data['State1'][city][area][counter].values():
                cityList.append(city)
                areaList.append(area)
For nested dictionaries, I found the following code posted by nosklo:
class AutoVivification(dict):
    """Implementation of perl's autovivification feature."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value
For dictionaries with duplicate keys, I found this code posted by Scorpil:
class Dictlist(dict):
    def __setitem__(self, key, value):
        try:
            self[key]
        except KeyError:
            super(Dictlist, self).__setitem__(key, [])
        self[key].append(value)
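How can I merge the AutoVivification and duplicate-key (Dictlist) class code? Or is there another Pythonic way to handle this?
One simple approach is to make the value a list and append each new entry to it: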
Data['State']['City']['Area'] = []
Data['State']['City']['Area'].append( ID )
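As a rough sketch of this list-per-key idea using only plain dicts, `dict.setdefault` can create the intermediate levels on demand. The `add_entry` helper and the sample values are assumptions made for illustration; they are not part of the original answer.

```python
def add_entry(data, state, city, area, value):
    """Append value to the list stored at data[state][city][area]."""
    # setdefault returns the existing child, or stores and returns the default.
    areas = data.setdefault(state, {}).setdefault(city, {})
    areas.setdefault(area, []).append(value)


data = {}
add_entry(data, 'State1', 'Landon', 'abc Area', 'BOB')
add_entry(data, 'State1', 'Landon', 'abc Area', 'SAM')
add_entry(data, 'State1', 'Landon', 'xyz Area', 'John')

print(data['State1']['Landon']['abc Area'])  # ['BOB', 'SAM']
```

The "duplicate keys" then live as multiple items in one list, so the number of duplicates for a key is simply the length of that list.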
You could replace the AutoVivification class with one that autovivifies Dictlists instead of dicts:
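class AutoVivificationDL(Dictlist):
    """Implementation of perl's autovivification feature, with list-valued leaves."""
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            # Store the child with dict.__setitem__; going through self[item] = ...
            # would call Dictlist.__setitem__, which reads self[key] and recurses.
            value = type(self)()
            dict.__setitem__(self, item, value)
            return value
    def __setitem__(self, key, value):
        # dict.setdefault bypasses __getitem__, so a missing key gets a list.
        dict.setdefault(self, key, []).append(value)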
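To make the combination easier to try out, here is a self-contained sketch of a dict that autovivifies on lookup and appends to a list on assignment. The class name `ListLeafDict` is hypothetical, and this is one possible way to merge the two behaviours rather than the code from either original post.

```python
class ListLeafDict(dict):
    """Hypothetical name: autovivifies on read, appends to a list on assignment."""

    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            # Store the new child via dict.__setitem__ so it bypasses the
            # list-appending __setitem__ defined below.
            child = type(self)()
            dict.__setitem__(self, item, child)
            return child

    def __setitem__(self, key, value):
        # dict.setdefault does not go through our __getitem__, so a missing
        # key gets a fresh list rather than an autovivified child.
        dict.setdefault(self, key, []).append(value)


data = ListLeafDict()
data['State1']['Landon']['abc Area'] = 'BOB'
data['State1']['Landon']['abc Area'] = 'SAM'
data['State1']['Landon']['xyz Area'] = 'John'

print(data['State1']['Landon']['abc Area'])  # ['BOB', 'SAM']
print(data['State1']['Landon']['xyz Area'])  # ['John']
```

One caveat with this design: merely reading a missing leaf key creates an empty child dict there, and a later assignment to that same key would fail because the child has no `append` method.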
# Use a (State, City, Area) tuple as the key instead of nesting dicts.
Data = {}
values = [
    dict(State="CA", City="San Francisco", Area="North", Id="customer1"),
    dict(State="CA", City="San Francisco", Area="Embarcadero", Id="customer1"),
    dict(State="CA", City="San Francisco", Area="North", Id="customer2"),
]
for v in values:
    # Grab the existing list; if the key doesn't exist yet, store and return a new one.
    li = Data.setdefault((v["State"], v["City"], v["Area"]), [])
    li.append(v["Id"])
print("Data:%s" % Data)
Output:
Data:{('CA', 'San Francisco', 'North'): ['customer1', 'customer2'], ('CA', 'San Francisco', 'Embarcadero'): ['customer1']}
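You are not limited to a very simple Id value; you can append pretty much anything you want to the list. If you need to do this in several places, have a look at https://docs.python.org/2/library/collections.html#collections.defaultdict, which has setdefault more or less built in.
In fact, you could add the IDs to a dictionary instead of a list; it works the same way.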
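For the tuple-keyed version, the `defaultdict` route could look roughly like the sketch below. The variable name `by_location` is an assumption chosen for illustration; the sample records are the ones used in this answer.

```python
from collections import defaultdict

values = [
    dict(State="CA", City="San Francisco", Area="North", Id="customer1"),
    dict(State="CA", City="San Francisco", Area="Embarcadero", Id="customer1"),
    dict(State="CA", City="San Francisco", Area="North", Id="customer2"),
]

# Missing keys start out as an empty list, so no setdefault call is needed.
by_location = defaultdict(list)
for v in values:
    by_location[(v["State"], v["City"], v["Area"])].append(v["Id"])

print(by_location[("CA", "San Francisco", "North")])  # ['customer1', 'customer2']
```

Note that looking up a key that was never added also inserts an empty list for it, which is the usual trade-off of `defaultdict` compared with `setdefault`.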
Another example, using defaultdict:
from collections import defaultdict
data = defaultdict(  # State
    lambda: defaultdict(  # City
        lambda: defaultdict(list)  # Area
    )
)
data['State']['City']['Area'].append('area 1')
data['State']['City']['Area'].append('area 2')
data['State']['City']['Area'].append('area 2')
areas = data['State']['City']['Area']
print(areas) # ['area 1', 'area 2', 'area 2']
total = len(areas)
print(total) # 3
How to get the lists of items you want with this solution:
data['State1']['Landon']['abc Area'].append('BOB')
data['State1']['Landon']['abc Area'].append('SAM')
data['State1']['Landon']['xyz Area'].append('John')
data['State2']['New York']['hjk Area'].append('Ricky')
def items_in(d):
    res = []
    if isinstance(d, list):
        res.extend(d)
    elif isinstance(d, dict):
        for k, v in d.items():
            res.extend([k] * len(items_in(v)))
    else:
        raise ValueError('Unknown data')
    return res
print(items_in(data['State1'])) # ['Landon', 'Landon', 'Landon']
print(items_in(data['State1']['Landon'])) # ['abc Area', 'abc Area', 'xyz Area'] (insertion order on Python 3.7+; arbitrary on older versions)
print(items_in(data['State1']['Landon']['abc Area'])) # ['BOB', 'SAM']
print(items_in(data['State1']['Landon']['xyz Area'])) # ['John']
print(items_in(data['State2'])) # ['New York']
print(items_in(data['State2']['New York'])) # ['hjk Area']
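If the goal is the parallel city and area lists from the question, another option is to flatten the nested structure into records first. The sketch below assumes the three-level `defaultdict` layout used in this answer; the names `iter_records`, `city_list`, and `area_list` are hypothetical, and the printed order relies on dicts preserving insertion order (Python 3.7+).

```python
from collections import Counter, defaultdict

data = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
data['State1']['Landon']['abc Area'].append('BOB')
data['State1']['Landon']['abc Area'].append('SAM')
data['State1']['Landon']['xyz Area'].append('John')
data['State2']['New York']['hjk Area'].append('Ricky')


def iter_records(data):
    """Yield one (state, city, area, item) tuple per stored item."""
    for state, cities in data.items():
        for city, areas in cities.items():
            for area, items in areas.items():
                for item in items:
                    yield state, city, area, item


# Keep only the records for one state, then split them into columns.
records = [r for r in iter_records(data) if r[0] == 'State1']
city_list = [city for _, city, _, _ in records]
area_list = [area for _, _, area, _ in records]

print(city_list)  # ['Landon', 'Landon', 'Landon']
print(area_list)  # ['abc Area', 'abc Area', 'xyz Area']
print(Counter(area_list)['abc Area'])  # 2
```

Because each stored item becomes one record, duplicates show up naturally as repeated rows, and `Counter` gives per-key totals without any extra counter key in the structure.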