Encoded target column shows only one category?

I am working on a multi-class classification problem. My target column has 4 classes: Low, Medium, High, and Very High. When I try to encode it, value_counts() shows only 0. I am not sure why.

The value counts in the original data frame are:
High         18767
Very High    15856
Medium        9212
Low           5067
Name: physician_segment, dtype: int64

I tried the following methods to encode my target column:

Using the replace() method:

target_enc = {'Low':0,'Medium':1,'High':2,'Very High':3}
df1['physician_segment'] = df1['physician_segment'].astype(object)
df1['physician_segment'] = df1['physician_segment'].replace(target_enc)
df1['physician_segment'].value_counts()
0    48902
Name: physician_segment, dtype: int64

Using the factorize() method:
from pandas.api.types import CategoricalDtype 
df1['physician_segment'] = df1['physician_segment'].factorize()[0]
df1['physician_segment'].value_counts()
0    48902
Name: physician_segment, dtype: int64

Using LabelEncoder:
from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()
df1['physician_segment'] = labelencoder.fit_transform(df1['physician_segment'])
df1['physician_segment'].value_counts()
0    48902
Name: physician_segment, dtype: int64

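With all three techniques I get only one class, 0, and the length of the data frame is 48902.

Can someone point out what I am doing wrong? I want my target column to have the values 0, 1, 2, 3.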

target_enc = {'Low':0,'Medium':1,'High':2,'Very High':3}
df1['physician_segment'] = df1['physician_segment'].astype(object)

After that, define a function:

def func(val):
    if val in target_enc.keys():
        return target_enc[val]
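
To make the behaviour of this function concrete, here is a minimal sketch that calls it on a few labels. The sample labels (including the misspelled 'high ' and the integer 2) are made up for illustration and do not come from the original data. Note that anything missing from the mapping falls through and returns None.

```python
target_enc = {'Low': 0, 'Medium': 1, 'High': 2, 'Very High': 3}

def func(val):
    # Return the integer code for a known label; unknown labels
    # fall through and implicitly return None.
    if val in target_enc.keys():
        return target_enc[val]

# Hypothetical sample labels: two valid ones, one with different case and a
# trailing space, and one value that is already an integer.
results = {label: func(label) for label in ['Low', 'Very High', 'high ', 2]}
print(results)
```

So if the column no longer holds the original strings (for example because an earlier cell already overwrote it), apply() with this function would produce missing values rather than the codes 0 to 3.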

Finally, use the apply() method:

df1['physician_segment']=df1['physician_segment'].apply(func)

Now if you print df1['physician_segment'].value_counts(), you should get the correct output.
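
The steps above can be tried end to end on a small stand-in frame. This is only a sketch: df_demo and its six rows are invented for illustration, since the real df1 is not shown. It also prints the unique raw labels first, which may help confirm that the column still contains the original strings before it is encoded.

```python
import pandas as pd

# Toy stand-in for df1; the real data is not available in the post.
df_demo = pd.DataFrame(
    {'physician_segment': ['High', 'Very High', 'Medium', 'Low', 'High', 'Very High']}
)

target_enc = {'Low': 0, 'Medium': 1, 'High': 2, 'Very High': 3}

def func(val):
    # Unknown labels fall through and return None.
    if val in target_enc.keys():
        return target_enc[val]

# Inspect the raw labels before overwriting anything.
print(df_demo['physician_segment'].unique())

# Encode into a new variable so the original column stays intact
# and the step can be re-run safely.
encoded = df_demo['physician_segment'].apply(func)
counts = encoded.value_counts().sort_index()
print(counts)
```

Writing the result to a new variable (or a new column) instead of overwriting physician_segment is a design choice of this sketch, not something the answer prescribes; it keeps the original labels available if the encoding needs to be repeated.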