Pandas not one hot encoding data

Question

This code:

df = pd.DataFrame({ 'id_val' : [1.0 , 2.0, 3.0] , 'c1': [1.0 , 2.0, 3.0], 'c2': [1.0 , 2.0, 3.0], 'c3': [1.0 , 2.0, 3.0] })
df

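produces this DataFrame:

   id_val   c1   c2   c3
0     1.0  1.0  1.0  1.0
1     2.0  2.0  2.0  2.0
2     3.0  3.0  3.0  3.0

Attempting to one-hot encode this DataFrame with get_dummies: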

pd.get_dummies(df)

returns the same DataFrame, unchanged.

How can I one-hot encode this DataFrame using get_dummies?

Answer 1

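By default, get_dummies only encodes columns with object, string, or category dtype and passes numeric columns through untouched, which is why the float columns come back unchanged. Convert the values to strings first: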

print (pd.get_dummies(df.astype(str)))
   id_val_1.0  id_val_2.0  id_val_3.0  c1_1.0  c1_2.0  c1_3.0  c2_1.0  c2_2.0  \
0           1           0           0       1       0       0       1       0   
1           0           1           0       0       1       0       0       1   
2           0           0           1       0       0       1       0       0   

   c2_3.0  c3_1.0  c3_2.0  c3_3.0  
0       0       1       0       0  
1       0       0       1       0  
2       1       0       0       1
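
As an alternative to casting to strings, the columns to encode can be named explicitly through the `columns` parameter of get_dummies, which encodes them whatever their dtype. The sketch below assumes that only c1, c2 and c3 should be encoded and that id_val should stay numeric; `dtype=int` is an added choice so the result holds 0/1 rather than booleans on newer pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'id_val': [1.0, 2.0, 3.0],
    'c1': [1.0, 2.0, 3.0],
    'c2': [1.0, 2.0, 3.0],
    'c3': [1.0, 2.0, 3.0],
})

# All columns are float64, so the default call has nothing to encode.
unchanged = pd.get_dummies(df)

# Naming the columns forces encoding regardless of dtype.
# Assumption: id_val is an identifier and should be left as-is.
encoded = pd.get_dummies(df, columns=['c1', 'c2', 'c3'], dtype=int)
print(encoded.columns.tolist())
```

The untouched id_val column is kept alongside the nine new indicator columns.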

If you only need dummies for the c columns, merged into one set of indicator columns:

df1 = (pd.get_dummies(df.filter(like='c').astype(str), prefix_sep='', prefix='')
         .max(axis=1, level=0))
print (df1)
   1.0  2.0  3.0
0    1    0    0
1    0    1    0
2    0    0    1
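
The `level` argument of `DataFrame.max` was removed in pandas 2.0, so the snippet above only runs on older versions. A possible equivalent for newer versions, offered as a sketch rather than the answerer's own code, is to transpose, group the duplicated labels, take the maximum, and transpose back. `dtype=int` is again an added assumption to keep the 0/1 output.

```python
import pandas as pd

df = pd.DataFrame({
    'id_val': [1.0, 2.0, 3.0],
    'c1': [1.0, 2.0, 3.0],
    'c2': [1.0, 2.0, 3.0],
    'c3': [1.0, 2.0, 3.0],
})

# Empty prefix and separator leave duplicated labels '1.0', '2.0', '3.0',
# one set per source column (c1, c2, c3).
dummies = pd.get_dummies(
    df.filter(like='c').astype(str), prefix='', prefix_sep='', dtype=int
)

# Transpose so the duplicated labels become the index, collapse them
# with a group-wise max, then transpose back to the original orientation.
df1 = dummies.T.groupby(level=0).max().T
print(df1)
```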

python

pandas