K-nearest neighbour algorithm is giving NotFittedError

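I am trying to run the KNN algorithm on a heart disease prediction dataset. When I try to pickle the model and create model.pkl, I get a NotFittedError. When I run the code, it gives me accurate predictions, but when I pickle it, it shows the error. How should I fit this data? I am new to machine learning, so please help.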

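import pandas as pd
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

# df is the heart-disease DataFrame (its loading code is not shown in the question)
dataset = pd.get_dummies(df, columns = ['sex', 'cp', 'fbs', 'restecg', 'exang', 'slope', 'ca', 'thal'])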

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
standardScaler = StandardScaler()
columns_to_scale = ['age', 'trestbps', 'chol', 'thalach', 'oldpeak']
dataset[columns_to_scale] = standardScaler.fit_transform(dataset[columns_to_scale])
y = dataset['target']
X = dataset.drop(['target'], axis = 1)
from sklearn.model_selection import cross_val_score
knn_scores = []
for k in range(1,21):
    knn_classifier = KNeighborsClassifier(n_neighbors = k)
    score=cross_val_score(knn_classifier,X,y,cv=10)
    knn_scores.append(score.mean())
plt.plot([k for k in range(1, 21)], knn_scores, color = 'red')
for i in range(1,21):
    plt.text(i, knn_scores[i-1], (i, knn_scores[i-1]))
plt.xticks([i for i in range(1, 21)])
plt.xlabel('Number of Neighbors (K)')
plt.ylabel('Scores')
plt.title('K Neighbors Classifier scores for different K values')

knn_classifier
knn_classifier = KNeighborsClassifier(n_neighbors = 12)
score=cross_val_score(knn_classifier,X,y,cv=10)
score.mean()
# Output: 0.8448387096774195
import pickle
pickle.dump(knn_classifier, open('model.pkl', 'wb'))
Heart_disease_detector_model = pickle.load(open('model.pkl', 'rb'))
y_pred = Heart_disease_detector_model.predict(X_test)
print('Accuracy of K – Nearest Neighbor  model = ',accuracy_score(y_test, y_pred))
---------------------------------------------------------------------------

>     NotFittedError                            Traceback (most recent call last)
>     <ipython-input-79-c37bd716088c> in <module>
>           2 pickle.dump(knn_classifier, open('model.pkl', 'wb'))
>           3 Heart_disease_detector_model = pickle.load(open('model.pkl', 'rb'))
>     ----> 4 y_pred = Heart_disease_detector_model.predict(X_test)
>           5 print('Accuracy of K – Nearest Neighbor  model = ',accuracy_score(y_test, y_pred))
>     
>     c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\neighbors\_classification.py in predict(self, X)
>         195         X = check_array(X, accept_sparse='csr')
>         196 
>     --> 197         neigh_dist, neigh_ind = self.kneighbors(X)
>         198         classes_ = self.classes_
>         199         _y = self._y
>     
>     c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\neighbors\_base.py in kneighbors(self, X, n_neighbors, return_distance)
>         647                [2]]...)
>         648         """
>     --> 649         check_is_fitted(self)
>         650 
>         651         if n_neighbors is None:
>     
>     c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
>          61             extra_args = len(args) - len(all_args)
>          62             if extra_args <= 0:
>     ---> 63                 return f(*args, **kwargs)
>          64 
>          65             # extra_args > 0
>     
>     c:\users\jahnavi padala\miniconda3\lib\site-packages\sklearn\utils\validation.py in check_is_fitted(estimator, attributes, msg, all_or_any)
>        1096 
>        1097     if not attrs:
>     -> 1098         raise NotFittedError(msg % {'name': type(estimator).__name__})
>        1099 
>        1100 
>     
>     NotFittedError: This KNeighborsClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

You can't create a useful pickle without fitting the model first. Before the line pickle.dump(knn_classifier, open('model.pkl', 'wb')), write knn_classifier.fit(your_X, your_Y).
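
As a rough sketch of that order of operations (fit first, then pickle, then load and predict), something like the following should work. It uses a synthetic dataset from make_classification as a stand-in for the heart-disease data, and the temporary file path and the name loaded_model are placeholders, not the asker's actual ones.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the heart-disease features and target (assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

knn_classifier = KNeighborsClassifier(n_neighbors=12)
knn_classifier.fit(X, y)  # fit BEFORE pickling

# Write into a temporary directory so the sketch leaves no file in the project.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(knn_classifier, f)

with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

# The loaded estimator carries the fitted state, so predict() does not
# raise NotFittedError here.
predictions = loaded_model.predict(X)
print(predictions[:5])
```

Opening the file in a with block also closes the handle right away, which the one-line open(...) calls in the question do not guarantee.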

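The error tells you that the classifier has not been fitted, which is exactly what it sounds like: you need to fit the model before using it. Do something like this before computing the accuracy score: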

knn_classifier.fit(X, y)

So you end up with this:

knn_classifier
knn_classifier = KNeighborsClassifier(n_neighbors = 12)
knn_classifier.fit(X, y)
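
It may also help to see why the code in the question reports a cross-validation score and still fails at predict: cross_val_score fits clones of the estimator for each fold and leaves the object you passed in unfitted. The sketch below checks that on synthetic data (an assumption, standing in for the heart-disease dataset) and then fits on a training split so that X_test and y_test, which the question uses without defining, actually exist.

```python
from sklearn.datasets import make_classification
from sklearn.exceptions import NotFittedError
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils.validation import check_is_fitted

# Synthetic stand-in for the heart-disease features and target (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

knn_classifier = KNeighborsClassifier(n_neighbors=12)

# cross_val_score trains one clone per fold; knn_classifier itself is untouched.
scores = cross_val_score(knn_classifier, X_train, y_train, cv=10)

try:
    check_is_fitted(knn_classifier)
    fitted_after_cv = True
except NotFittedError:
    fitted_after_cv = False
print("fitted after cross_val_score:", fitted_after_cv)

# An explicit fit is what makes predict() (and a later pickle) usable.
knn_classifier.fit(X_train, y_train)
y_pred = knn_classifier.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("mean CV score:", scores.mean())
print("held-out accuracy:", test_accuracy)
```

Fitting on X_train rather than on all of X keeps the test rows unseen, so the accuracy printed at the end is a held-out estimate rather than a score on data the model has already memorised.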