ValueError: could not convert string to float: 'what' (sklearn): how to use LabelEncoder?

Question

I have two training sets, an input set and an output set:

X = df['First Word']

y = df['Answers']

When I try:

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()
model.fit(X,y)
predictions = model.predict(['how'])

I get the error:

ValueError: could not convert string to float: 'what'

The error means that strings (str) cannot be passed to the fit() method.

How can I use LabelEncoder in this case to make the code above work?

Answer 1

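scikit-learn models such as DecisionTreeClassifier expect numeric input, so you need to encode your input data first, using either a label encoder or one-hot encoding depending on your needs.

You can use the following code to encode your column: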

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
# fit() expects a 2-D feature array, so reshape the encoded column
X = le.fit_transform(X).reshape(-1, 1)

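Once you pass the encoded data to the model, that error should go away. Remember to encode the value you pass to predict() with the same fitted encoder, for example le.transform(['how']).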
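For reference, here is a minimal end-to-end sketch of the approach described in this answer. The DataFrame contents are made up for illustration (the question does not show the real data); only the column names 'First Word' and 'Answers' come from the question. It assumes every word you later predict on was present when the encoder was fitted.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the asker's DataFrame (values are invented)
df = pd.DataFrame({
    "First Word": ["what", "how", "what", "why", "how"],
    "Answers": ["a", "b", "a", "c", "b"],
})

le = LabelEncoder()
# Encode the words as integers, then reshape to the 2-D shape fit() expects
X = le.fit_transform(df["First Word"]).reshape(-1, 1)
y = df["Answers"]  # string class labels are accepted as a classifier target

model = DecisionTreeClassifier()
model.fit(X, y)

# Encode the query with the same fitted encoder before predicting
query = le.transform(["how"]).reshape(-1, 1)
predictions = model.predict(query)
print(predictions)
```

Note that le.transform() raises a ValueError for a word the encoder has not seen. Also, the scikit-learn documentation describes LabelEncoder as intended for target values; for input features, OrdinalEncoder or OneHotEncoder from sklearn.preprocessing may be a better fit, since integer codes imply an ordering between words that does not really exist.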

python