Selecting distinct keys and their counts from a Series of dictionaries in Python

I have a pandas Series of dictionaries with values like this:
   0 {AA:25,BB:31}
   1 {CC:45,AA:3}
   2 {BB:3,CD:4,AA:5}

I want to build a dictionary from it that maps each key to the number of dictionaries it occurs in, like:

{AA:3,BB:2,CC:1,CD:1}

I doubt there is a built-in solution, so you will have to iterate manually and count each key in each dictionary.

import pandas as pd
from collections import defaultdict

ser = pd.Series([{'AA':25,'BB':31},
                 {'CC':45,'AA':3},
                 {'BB':3,'CD':4,'AA':5}])

count = defaultdict(int)

for d in ser:
    for key in d:
        count[key] += 1

print(count)
# defaultdict(<class 'int'>, {'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1})
# (key order shown for Python 3.7+, where dicts keep insertion order)

You can also use Counter, but it feels "forced" in this case:

import pandas as pd
from collections import Counter

total = Counter()

ser = pd.Series([{'AA':25,'BB':31},
                 {'CC':45,'AA':3},
                 {'BB':3,'CD':4,'AA':5}])

for d in ser:
    total.update(d.keys())

print(total)
# Counter({'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1})
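
A plain dict with dict.get works as well (here series is the Series of dictionaries from the question):

counter = dict()
for item in series:
    for key in item:
        # start from 0 the first time a key is seen
        counter[key] = counter.get(key, 0) + 1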
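
The loop-based answers above can be wrapped in a small reusable helper. This is only a sketch: the name count_keys is hypothetical, and skipping non-dict entries (for example NaN placeholders in the Series) is an assumption not stated in the question. A plain list stands in for the Series here, since iterating a Series yields its values in the same way.

```python
from collections import Counter

def count_keys(dicts):
    """Count in how many dictionaries each key appears.

    `dicts` is any iterable of dictionaries (a list, or a pandas Series).
    Assumption: entries that are not dicts, such as NaN, are skipped.
    """
    total = Counter()
    for d in dicts:
        if isinstance(d, dict):
            total.update(d.keys())
    return dict(total)

data = [{'AA': 25, 'BB': 31},
        {'CC': 45, 'AA': 3},
        float('nan'),                    # a missing entry, ignored
        {'BB': 3, 'CD': 4, 'AA': 5}]

result = count_keys(data)
print(result)
```

Passing the Series itself, as in count_keys(ser), should behave the same way.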

Convert your series into a series of key lists, sum those to get a single flat list of keys, and then use Counter:

In [23]: pd.Series([{'AA':25,'BB':31},{'CC':45,'AA':3},{'BB':3,'CD':4,'AA':5}])
Out[23]: 
0           {'AA': 25, 'BB': 31}
1            {'AA': 3, 'CC': 45}
2    {'CD': 4, 'AA': 5, 'BB': 3}
dtype: object

In [24]: series = _

In [34]: from collections import Counter

In [35]: Counter(series.apply(lambda x: list(x.keys())).sum())
Out[35]: Counter({'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1})

Or flatten with a generator expression:

In [37]: Counter(k for d in series for k in d.keys())
Out[37]: Counter({'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1})
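
A further variant of the flattening idea, sketched here with itertools.chain.from_iterable (a swapped-in standard-library tool, not part of the answer above). Iterating a dict yields its keys, so chaining the dictionaries gives one flat stream of keys without building intermediate lists. Summing lists, by contrast, copies the accumulated list at every step and can get slow on long series. A plain list stands in for the Series.

```python
from collections import Counter
from itertools import chain

dicts = [{'AA': 25, 'BB': 31},
         {'CC': 45, 'AA': 3},
         {'BB': 3, 'CD': 4, 'AA': 5}]

# chain.from_iterable walks each dict in turn, yielding its keys
counts = Counter(chain.from_iterable(dicts))
print(counts)
```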

Maybe a bit late, but here is another approach using pandas built-in functions.

s = pd.Series([{'AA':25,'BB':31},
               {'CC':45,'AA':3},
               {'BB':3,'CD':4,'AA':5}])

# Expand each dict into a DataFrame row (missing keys become NaN),
# count the non-NaN entries per column, then convert the result to a dict.
s.apply(pd.Series).count().to_dict()
Out[651]: {'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1}
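
The same idea can be sketched without apply by handing the list of dicts straight to the DataFrame constructor, which is usually faster than building one Series per row. Note the assumption both versions share: count() ignores NaN and None, so a key whose value is itself missing would not be counted for that row.

```python
import pandas as pd

s = pd.Series([{'AA': 25, 'BB': 31},
               {'CC': 45, 'AA': 3},
               {'BB': 3, 'CD': 4, 'AA': 5}])

# One column per distinct key; keys absent from a row become NaN
frame = pd.DataFrame(s.tolist())

# count() tallies the non-missing cells in each column
counts = frame.count().to_dict()
print(counts)
```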