How to convert a DataFrame to an array of counts (based on a column)

I have the following DataFrame:

Line    emotion
0   d...    anger
1   a ...   shame
2   b...    sadness
3   c...    joy
4   d...    shame
... ... ...
117 f...    joy
118 g...    disgust
119 h...    disgust
120 i...    fear
121 j   anger

and I need to draw a radar chart of the emotions, like this:

import numpy as np
import matplotlib.pyplot as plt

categories = ['Joy', 'Fear', 'Anger', 'Sadness', 'Disgust', 'Shame','Guilt']
q1 = [4, 4, 5, 4, 3, 7, 10]
label_loc = np.linspace(start=0, stop=2*np.pi, num=len(q1)+1)[:-1]
plt.figure(figsize=(8, 8))
plt.subplot(polar=True)
plt.plot(label_loc, q1, label='q 1')
plt.title('Answer to Question 1 - Emotion Analysis', size=20)
lines, labels = plt.thetagrids(np.degrees(label_loc), labels=categories)
plt.legend()
plt.show()

The result is:

My question is: how can I easily convert the pandas DataFrame into an array of counts, one per emotion:

 q1 = [4, 4, 5, 4, 3, 7, 10]

where each number corresponds to one of these emotions:

categories = ['Joy', 'Fear', 'Anger', 'Sadness', 'Disgust', 'Shame','Guilt']

Use Series.value_counts, then convert the values and the index to lists:

# Count how often each emotion occurs (most frequent first)
s = df['emotion'].value_counts()
q1 = s.tolist()
categories = s.index.tolist()
print(q1)
[2, 2, 2, 2, 1, 1]

print(categories)
['joy', 'anger', 'shame', 'disgust', 'sadness', 'fear']
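To make the behaviour of value_counts concrete, here is a minimal self-contained sketch. The DataFrame below is a hypothetical sample that only mirrors the ten rows visible in the question, not the full data. It illustrates that the result is ordered by frequency and that an emotion absent from the column (such as guilt) does not appear at all.

```python
import pandas as pd

# Hypothetical sample: only the ten rows visible in the question
df = pd.DataFrame({
    "emotion": ["anger", "shame", "sadness", "joy", "shame",
                "joy", "disgust", "disgust", "fear", "anger"]
})

# value_counts returns a Series indexed by the distinct values,
# sorted by count in descending order
s = df["emotion"].value_counts()

counts = s.to_dict()          # mapping emotion -> count
print(counts)
print("guilt" in s.index)     # an emotion that never occurs is not listed
```

Because the order follows the counts rather than a fixed list of categories, the lists cannot be fed to the radar chart directly when the axis order matters.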

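If the order matters, convert the category names to lowercase and add Series.reindex; fill_value=0 gives a count of zero to any category that does not occur in the column: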

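categories = ['Joy', 'Fear', 'Anger', 'Sadness', 'Disgust', 'Shame', 'Guilt']

# The column holds lowercase labels, so lowercase the category names to match
cats = [x.lower() for x in categories]
# Reorder the counts to follow cats; missing categories become 0
q1 = df['emotion'].value_counts().reindex(cats, fill_value=0).tolist()
print(q1)
[2, 1, 2, 1, 2, 2, 0]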
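The reindex approach can be wrapped in a small helper, sketched below. The function name emotion_counts and the sample DataFrame are hypothetical. Stripping whitespace and lowercasing the column values is an extra assumption, made in case the labels are not already clean.

```python
import pandas as pd

def emotion_counts(emotions, categories):
    """Return one count per category, in the given order (0 if absent)."""
    wanted = [c.lower() for c in categories]
    # Assumption: labels may differ in case or carry stray whitespace
    normalized = emotions.str.strip().str.lower()
    return normalized.value_counts().reindex(wanted, fill_value=0).tolist()

categories = ['Joy', 'Fear', 'Anger', 'Sadness', 'Disgust', 'Shame', 'Guilt']

# Hypothetical sample: only the ten rows visible in the question
df = pd.DataFrame({
    "emotion": ["anger", "shame", "sadness", "joy", "shame",
                "joy", "disgust", "disgust", "fear", "anger"]
})

q1 = emotion_counts(df["emotion"], categories)
print(q1)  # [2, 1, 2, 1, 2, 2, 0]
```

The resulting q1 has the same length and order as categories, so it can replace the hard-coded list in the plotting code from the question.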