How to generate a nested dictionary faster than with loops?
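I would like to know how to create a nested dictionary faster than with plain loops, which can become a burden with large amounts of data. This is what I have been doing: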
list_1 = {1, 2, 3, 4, 5, 6, ....}
list_2 = {"a", "b", "c", "d", "e", "f", ....}
list_3 = {10, 20, 30, 40, 50, 60 ....}
dico = {}
for element_1 in list_1:
    dico[element_1] = {}
    for element_2 in list_2:
        dico[element_1][element_2] = {}
        for element_3 in list_3:
            dico[element_1][element_2][element_3] = {}
The problem is that I expect this to get really slow with several nesting levels and a lot of data...
Thanks
Try this nested dictionary comprehension:
print({i: {x: {y: {} for y in list_3} for x in list_2} for i in list_1})
Output (wrapped here, one top-level key per line):
{1: {'d': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'c': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'a': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'e': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'b': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'f': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}},
 2: {'d': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'c': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'a': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'e': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'b': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'f': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}},
 3: {'d': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'c': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'a': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'e': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'b': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'f': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}},
 4: {'d': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'c': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'a': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'e': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'b': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'f': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}},
 5: {'d': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'c': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'a': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'e': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'b': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'f': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}},
 6: {'d': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'c': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'a': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'e': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'b': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}, 'f': {40: {}, 10: {}, 50: {}, 20: {}, 60: {}, 30: {}}}}
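Since the question mentions several nesting levels, the comprehension can probably be generalized to any depth with a small recursive helper. The following is only a sketch: `nested_dict` and the short sample key lists are hypothetical names and values introduced for illustration, and the recursion has not been benchmarked here, so it may well be slower than the hand-written three-level comprehension.

```python
def nested_dict(levels):
    """Build nested dicts keyed by each iterable in `levels`, outermost first."""
    if not levels:
        return {}  # innermost value: a fresh empty dict for every leaf
    first, *rest = levels
    return {key: nested_dict(rest) for key in first}


# Small sample keys, chosen for illustration only.
list_1 = [1, 2]
list_2 = ["a", "b"]
list_3 = [10, 20]

dico = nested_dict([list_1, list_2, list_3])
print(dico[1]["a"])
```

Each call builds its own inner dicts, so no two branches share the same object.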
Still loops, but faster and shorter:
dico = {}
for element_1 in list_1:
    dico1 = dico[element_1] = {}
    for element_2 in list_2:
        dico2 = dico1[element_2] = {}
        for element_3 in list_3:
            dico2[element_3] = {}
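The gain presumably comes from keeping a local reference to each inner dict instead of looking it up again through `dico[element_1][element_2]` on every iteration. A minimal sketch of what the chained assignment does (the keys `"outer"` and `"inner"` are placeholders, not part of the original code):

```python
dico = {}
# Chained assignment: create one new dict and bind it to both targets.
dico1 = dico["outer"] = {}

# dico1 and dico["outer"] are the same object, so writing through the
# local name fills the nested structure without repeating the outer lookup.
dico1["inner"] = {}
same_object = dico1 is dico["outer"]
print(same_object, dico)
```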
Benchmark results:
 Round 1  Round 2  Round 3
   59 μs    58 μs    55 μs  original
   40 μs    41 μs    39 μs  optimized
   46 μs    44 μs    43 μs  U11
Benchmark code (Try it online!):
from timeit import timeit

list_1 = {1, 2, 3, 4, 5, 6, ...}
list_2 = {"a", "b", "c", "d", "e", "f", ...}
list_3 = {10, 20, 30, 40, 50, 60, ...}

def original():
    dico = {}
    for element_1 in list_1:
        dico[element_1] = {}
        for element_2 in list_2:
            dico[element_1][element_2] = {}
            for element_3 in list_3:
                dico[element_1][element_2][element_3] = {}
    return dico

def optimized():
    dico = {}
    for element_1 in list_1:
        dico1 = dico[element_1] = {}
        for element_2 in list_2:
            dico2 = dico1[element_2] = {}
            for element_3 in list_3:
                dico2[element_3] = {}
    return dico

def U11():
    return {i: {x: {y: {} for y in list_3} for x in list_2} for i in list_1}

# config
funcs = original, optimized, U11
number = 1000

# correctness
expect = original()
for func in funcs:
    result = func()
    print(result == expect, func.__name__)
print()

# speed
tss = [[] for _ in funcs]
for r in range(1, 4):
    print(*(f'Round {i} ' for i in range(1, r+1)))
    for func, ts in zip(funcs, tss):
        # timeit returns total seconds; divide by number for time per call
        t = timeit(func, number=number) / number
        ts.append(t)
        # t * 1e6 converts seconds to microseconds
        print(*('%4d μs ' % (t * 1e6) for t in ts), func.__name__)
    print()
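As a possible variation on the timing loop, `timeit.repeat` can run several rounds in one call, and taking the minimum is a common way to reduce noise. This is only a sketch under assumptions: the key ranges, the function name `comprehension`, and the choice of 5 repeats of 200 calls are illustrative and not taken from the benchmark above.

```python
from timeit import repeat

# Illustrative key sets (assumed sizes, six keys per level).
keys_1 = range(6)
keys_2 = "abcdef"
keys_3 = range(10, 70, 10)


def comprehension():
    return {i: {x: {y: {} for y in keys_3} for x in keys_2} for i in keys_1}


number = 200
# repeat() returns one total time in seconds per round.
times = repeat(comprehension, number=number, repeat=5)
best_us = min(times) / number * 1e6  # seconds per call -> microseconds
print(f"{best_us:.1f} microseconds per call (best of 5 rounds)")
```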