How to pivot one column with n categories into n binary-valued columns?

Question

I have the following dataframe:

id	gender	name	...	status
1	M	John	...	Withdrawn
2	F	Mary	...	Pass
...	...	...	...	...
10	F	Kate	...	Fail

I want to convert it into a dataframe like this:

id	gender	name	...	Withdrawn	Pass	Fail
1	M	John	...	1	0	0
2	F	Mary	...	0	1	0
...	...	...	...	...	...	...
10	F	Kate	...	0	0	1

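Is it possible to do something like this with a function such as pivot_table, or do I need to write a function that iterates over each row and appends the values to the corresponding columns?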

Answer 1

It's as simple as using dummy variables:

import pandas as pd
# columns=['status'] already drops the original column; a further drop would raise KeyError
df = pd.get_dummies(df, columns=['status'])
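For reference, here is a minimal self-contained sketch of this approach. The rows are made up to mirror the question's table (the elided columns are left out). Two details are worth knowing: get_dummies with columns= removes the encoded column itself, and by default it names the new columns status_Fail, status_Pass and so on. Passing prefix='' and prefix_sep='' is one way to get bare category names, and dtype=int is used here on the assumption that 0/1 values are wanted rather than booleans.

```python
import pandas as pd

# Made-up rows modeled on the question's table (elided columns omitted).
df = pd.DataFrame({
    "id": [1, 2, 10],
    "gender": ["M", "F", "F"],
    "name": ["John", "Mary", "Kate"],
    "status": ["Withdrawn", "Pass", "Fail"],
})

# columns=["status"] encodes that column and drops the original in one step.
# prefix="" and prefix_sep="" keep bare category names instead of "status_Pass" etc.
# dtype=int yields 0/1 integers rather than booleans.
encoded = pd.get_dummies(df, columns=["status"], prefix="", prefix_sep="", dtype=int)

print(encoded)
```

The indicator columns come out in sorted label order (Fail, Pass, Withdrawn), not in the order they appear in the data.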

Answer 2

Use pandas.get_dummies on the 'status' column and join the result to the original dataframe with 'status' dropped:

df.drop(columns='status').join(pd.get_dummies(df['status']))

Output:

   id  gender  name    Fail  Pass  Withdrawn
0    1      M   John      0     0          1
1    2      F   Mary      0     1          0
2   10      F   Kate      1     0          0
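The output above lists the indicator columns alphabetically, while the question asks for Withdrawn, Pass, Fail. A small sketch of one way to get that order, assuming order of first appearance in the data is what is wanted, and again using made-up rows and dtype=int for 0/1 values:

```python
import pandas as pd

# Made-up rows modeled on the question's table (elided columns omitted).
df = pd.DataFrame({
    "id": [1, 2, 10],
    "gender": ["M", "F", "F"],
    "name": ["John", "Mary", "Kate"],
    "status": ["Withdrawn", "Pass", "Fail"],
})

# Category labels in order of first appearance: Withdrawn, Pass, Fail.
order = df["status"].unique().tolist()

# get_dummies sorts the labels, so reindex the columns into the desired order.
dummies = pd.get_dummies(df["status"], dtype=int).reindex(columns=order)

# Join on the shared index after dropping the original column.
result = df.drop(columns="status").join(dummies)

print(result)
```

The join works here because dummies keeps the same index as df; if the index is not unique, resetting it first avoids duplicated rows.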

python

pivot-table

dataframe

data-transform