How to store several pieces of info in a defaultdict
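I have a list that I iterate over to count some combinations, and I'd like to store some information beyond the count itself. Counter or defaultdict works well for counting, but I'm not sure how to attach auxiliary information. For example, to tally every 2-item combination drawn from the lists in my 'list_to_count' column, I can do this: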
import pandas as pd
from itertools import combinations
from collections import defaultdict

mydf = pd.DataFrame({
    'auxinfo': ['first', 'second', 'third'],
    'list_to_count': [['apple', 'banana'], ['apple', 'banana', 'chicken'], ['apple']],
})
print(mydf)

# Count how often each sorted 2-item combination appears across all rows.
d = defaultdict(int)
for r in mydf.itertuples():
    combos = combinations(r.list_to_count, 2)
    for combo in combos:
        combo_name = ','.join(sorted(combo))
        d[combo_name] += 1
print(d)
This is what I get:
  auxinfo             list_to_count
0   first           [apple, banana]
1  second  [apple, banana, chicken]
2   third                   [apple]
In [13]: d
Out[13]: defaultdict(int, {'apple,banana': 2, 'apple,chicken': 1, 'banana,chicken': 1})
But I'd also like to store auxinfo, for example in a list, so the desired output would look like
{'apple,banana': (2, ['first', 'second']), 'apple,chicken': (1, ['second']), 'banana,chicken': (1, ['second'])}
A defaultdict can be initialized like defaultdict(tuple), where I could store a tuple of (count, auxinfo_list), but then auxinfo_list itself is not a defaultdict.
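You can use dict.get() and set the default value to (0, []). Because tuples are immutable, each iteration builds a new (count, auxinfo_list) tuple and stores it back under the key.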
d = {}
for r in mydf.itertuples():
    combos = combinations(r.list_to_count, 2)
    for combo in combos:
        combo_name = ','.join(sorted(combo))
        # Fall back to (0, []) the first time a combination is seen.
        count, auxinfo_list = d.get(combo_name, (0, []))
        d[combo_name] = (count + 1, auxinfo_list + [r.auxinfo])

for key, value in d.items():
    print(f'{key}:\t{value}')
Output:
apple,banana: (2, ['first', 'second'])
apple,chicken: (1, ['second'])
banana,chicken: (1, ['second'])
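If you would rather stay with defaultdict, as the title asks, one possible variant is to give it a factory that returns a fresh mutable [count, auxinfo_list] pair for each new key. The sketch below is only an illustration: to keep it self-contained it replaces mydf with a plain list of (auxinfo, list_to_count) pairs named rows, and both rows and result are names introduced here, not part of the original code.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: a plain-Python stand-in for the rows of mydf,
# given as (auxinfo, list_to_count) pairs.
rows = [
    ("first", ["apple", "banana"]),
    ("second", ["apple", "banana", "chicken"]),
    ("third", ["apple"]),
]

# Each missing key starts as a fresh, mutable [count, auxinfo_list] pair.
# A lambda is used because defaultdict(tuple) would only ever supply ().
d = defaultdict(lambda: [0, []])

for auxinfo, items in rows:
    for combo in combinations(items, 2):
        combo_name = ",".join(sorted(combo))
        entry = d[combo_name]      # created on first access
        entry[0] += 1              # bump the count in place
        entry[1].append(auxinfo)   # remember which row contributed

# Convert to the (count, auxinfo_list) tuples from the desired output.
result = {key: (count, aux) for key, (count, aux) in d.items()}
print(result)
```

Compared with dict.get(), this mutates each entry in place instead of rebuilding the tuple and list on every update, which may matter if the lists grow long.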