Using apply to populate a dictionary

Question

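I have a DataFrame column that contains Python lists of tags. I need to build a dictionary that counts how many times each tag is used. This is how I did it: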

tags_use_count = {}

def count_tags(tag_list):
    
    for tag in tag_list:
        if tag in tags_use_count:
            tags_use_count[tag] += 1
        else:
            tags_use_count[tag] = 1

q2019['Tags'].apply(count_tags)

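It works fine, but I wonder whether this is a good approach. Somehow, using apply this way feels like a hacky workaround that experienced coders would frown upon. (I guess this isn't what apply is meant for.) The dataset is small, so I suppose I could loop over the column with iterrows, but I know that is not a good idea for larger datasets. I'd like to know whether my approach would be acceptable in that case, or whether there is a better way.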

Answer 1

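IIUC, you just want to count every item across the lists in each row. So you can explode the 'Tags' column, count the values, and convert the result to a dictionary: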

q2019['Tags'].explode().value_counts().to_dict()
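
To see what that one-liner does, here is a minimal sketch on a made-up frame. The q2019 built here is only a hypothetical stand-in for the asker's data (the real contents are not shown in the question), and Series.explode needs pandas 0.25 or later:

```python
import pandas as pd

# Hypothetical stand-in for q2019: each cell of 'Tags' holds a list of tag strings.
q2019 = pd.DataFrame({
    "Tags": [["python", "pandas"], ["python"], ["pandas", "numpy", "python"]],
})

# explode() gives every list element its own row,
# value_counts() tallies identical values,
# to_dict() turns the resulting Series into a plain dictionary.
tags_use_count = q2019["Tags"].explode().value_counts().to_dict()
print(tags_use_count)
```

Note that an empty list explodes to NaN, which value_counts drops by default, so rows without tags should simply not contribute to the counts.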

Answer 2

You can use collections.Counter to do this:

>>> from collections import Counter
>>> tag_list = ['tag_a', 'tag_b', 'tag_b', 'tag_c']
>>> dict(Counter(tag_list))
{'tag_a': 1, 'tag_b': 2, 'tag_c': 1}
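
The snippet above counts a single list. For a whole column of lists, one possible way (a sketch, not necessarily what the answerer had in mind) is to flatten the column with itertools.chain.from_iterable before handing it to Counter. The q2019 frame below is a hypothetical stand-in for the asker's data:

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Hypothetical stand-in for q2019: each cell of 'Tags' holds a list of tag strings.
q2019 = pd.DataFrame({
    "Tags": [["python", "pandas"], ["python"], ["pandas", "numpy", "python"]],
})

# chain.from_iterable() lazily flattens the column of lists into one stream
# of tags; Counter tallies them in a single pass, with no apply and no
# module-level dictionary mutated as a side effect.
tags_use_count = dict(Counter(chain.from_iterable(q2019["Tags"])))
print(tags_use_count)
```

This keeps the counting in plain Python, so it does not depend on explode being available in the installed pandas version.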

Tags: python, dictionary, apply, pandas