The number of features of the model must match the input. Model n_features is 7985 and input n_features is 1
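I built a spam classifier with a random forest and wanted to write a separate function that classifies a single text message as spam or ham. This is what I tried: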
    def predict_message(pred_text):
        pred_text = [pred_text]
        pred_text2 = tfidf_vect.fit_transform(pred_text)
        pred_features = pd.DataFrame(pred_text2.toarray())
        prediction = rf_model.predict(pred_features)
        return prediction

    pred_text = "how are you doing today?"
    prediction = predict_message(pred_text)
    print(prediction)
But it gives me this error:
The number of features of the model must match the input.
Model n_features is 7985 and input n_features is 1
I can't see what the problem is. How can I fix it?
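By calling tfidf_vect.fit_transform(pred_text), your vectorizer loses everything it learned from the original training corpus: it is refitted on the single new message, so the matrix it produces no longer has the 7985 columns the model was trained on. You should call transform instead, which reuses the vocabulary learned during training.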
The following changes should help:
    def predict_message(pred_text):
        pred_text = [pred_text]
        # Changed: transform() reuses the fitted vocabulary instead of refitting
        pred_text2 = tfidf_vect.transform(pred_text)
        prediction = rf_model.predict(pred_text2)
        return prediction

    pred_text = "how are you doing today?"
    prediction = predict_message(pred_text)
    print(prediction)
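To see why refitting changes the number of features, here is a minimal pure-Python sketch. ToyCountVectorizer is a hypothetical stand-in written only for illustration; it is not scikit-learn's TfidfVectorizer and it counts whitespace-separated tokens rather than computing TF-IDF weights. The sample messages are made up as well. The point it tries to show is that the column count comes from the vocabulary learned in fit, so transform keeps the training width while fit_transform on a new message replaces it.

```python
class ToyCountVectorizer:
    """Toy stand-in for a text vectorizer (assumption: NOT scikit-learn's code)."""

    def fit(self, docs):
        # Learn one column per distinct lowercase token in the corpus.
        tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
        self.vocabulary_ = {tok: i for i, tok in enumerate(tokens)}
        return self

    def transform(self, docs):
        # Reuse the learned vocabulary; tokens not seen during fit are ignored.
        rows = []
        for doc in docs:
            row = [0] * len(self.vocabulary_)
            for tok in doc.lower().split():
                col = self.vocabulary_.get(tok)
                if col is not None:
                    row[col] += 1
            rows.append(row)
        return rows

    def fit_transform(self, docs):
        # Discards any previously learned vocabulary and learns a new one.
        return self.fit(docs).transform(docs)


# Made-up training corpus and new message, for illustration only.
train_docs = ["win a free prize now", "are you free today", "call me now"]
new_doc = ["how are you doing today?"]

vect = ToyCountVectorizer()
train_matrix = vect.fit_transform(train_docs)
n_train_features = len(train_matrix[0])

# Correct: transform keeps the training vocabulary, so the width matches.
ok_width = len(vect.transform(new_doc)[0])

# The bug from the question: refitting on the new message alone.
refit_width = len(vect.fit_transform(new_doc)[0])

print("features seen in training:", n_train_features)
print("width after transform:", ok_width)
print("width after fit_transform:", refit_width)
```

The same reasoning applies to the real tfidf_vect: fit it once on the training messages, keep that fitted object around, and only call transform on anything passed to rf_model.predict.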