Accuracy ranking of K-fold cross-validation doesn't agree with accuracy ranking of individual models
This is my first time running k-fold cross-validation, and I am confused by what I see in the output. Basically, 5-fold cross-validation consistently gives the highest accuracy scores to model 8 (AdaBoost classifier) and model 9 (gradient boosting classifier), as shown below. However, when I run these ML models individually with 20% of the dataset held out as test data, model 7 (random forest classifier) consistently produces the highest accuracy of all five models according to the confusion matrix and AUC. My initial expectation was that a model with high k-fold cross-validation accuracy should also return high accuracy when run on its own. That doesn't seem to be the case here. Can someone explain why I'm seeing this discrepancy?
These are the ML models I used to train the data:
model6 = DecisionTreeClassifier()
model7 = RandomForestClassifier(n_estimators=300)
model8 = AdaBoostClassifier(n_estimators=300)
model9 = GradientBoostingClassifier(n_estimators=300, learning_rate=1.0, max_depth=1, random_state=0)
model10 = KNeighborsClassifier(n_neighbors=5)
Here is my complete code for the 5-fold CV and the standalone ML models:
# Imports needed for this snippet to run on its own:
import warnings
from sklearn import metrics
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

X_train, X_test, Y_train, Y_test = train_test_split(whole_data_input, whole_data_output, test_size=0.2)
# Reset each split's index; reset_index(drop=True) is equivalent to
# resetting and then dropping the added 'index' column:
for part in (X_train, X_test, Y_train, Y_test):
    part.reset_index(drop=True, inplace=True)
warnings.filterwarnings('ignore')
model6 = DecisionTreeClassifier()
model7 = RandomForestClassifier(n_estimators=300)
model8 = AdaBoostClassifier(n_estimators=300)
model9 = GradientBoostingClassifier(n_estimators=300,
learning_rate=1.0,max_depth=1, random_state=0)
model10 = KNeighborsClassifier(n_neighbors=5)
model6.fit(X_train, Y_train)
model7.fit(X_train, Y_train)
model8.fit(X_train, Y_train)
model9.fit(X_train, Y_train)
model10.fit(X_train, Y_train)
# Perform 5-fold cross validation across different models:
# Use whole_data['label'] (a 1-D Series) instead of the whole_data[['label']]
# created earlier, because cross_val_score expects a 1-D target:
whole_data_output=whole_data['label']
print('THE FOLLOWING OUTPUT REPRESENT ACCURACIES OF 5-FOLD VALIDATIONS FROM VARIOUS ML MODELS:')
print()
scores = cross_val_score(model6, whole_data_input, whole_data_output, cv=5)
print('Cross-validated scores for model6, Decision Tree Classifier, is:' + str(scores))
print()
scores = cross_val_score(model7, whole_data_input, whole_data_output, cv=5)
print('Cross-validated scores for model7, Random Forest Classifier, is:' + str(scores))
print()
scores = cross_val_score(model8, whole_data_input, whole_data_output, cv=5)
print('Cross-validated scores for model8, Adaboost Classifier, is:' + str(scores))
print()
scores = cross_val_score(model9, whole_data_input, whole_data_output, cv=5)
print('Cross-validated scores for model9, Gradient Boosting Classifier, is:' + str(scores))
print()
scores = cross_val_score(model10, whole_data_input, whole_data_output, cv=5)
print('Cross-validated scores for model10, K Neighbors Classifier, is:' + str(scores))
print('THE FOLLOWING OUTPUT REPRESENT RESULTS FROM VARIOUS ML MODELS:')
print()
result6 = model6.predict(X_test)
result7 = model7.predict(X_test)
result8 = model8.predict(X_test)
result9 = model9.predict(X_test)
result10 = model10.predict(X_test)
from sklearn.metrics import classification_report
print('Classification report for model 6, decision tree classifier, is: ')
print(confusion_matrix(Y_test,result6))
print()
print(classification_report(Y_test,result6))
print()
print("Area under curve (auc) of model6 is: ", metrics.roc_auc_score(Y_test, result6))
print()
print('Classification report for model 7, random forest classifier, is: ')
print(confusion_matrix(Y_test,result7))
print()
print(classification_report(Y_test,result7))
print()
print("Area under curve (auc) of model7 is: ", metrics.roc_auc_score(Y_test, result7))
print()
print('Classification report for model 8, adaboost classifier, is: ')
print(confusion_matrix(Y_test,result8))
print()
print(classification_report(Y_test,result8))
print()
print("Area under curve (auc) of model8 is: ", metrics.roc_auc_score(Y_test, result8))
print()
print('Classification report for model 9, gradient boosting classifier, is: ')
print(confusion_matrix(Y_test,result9))
print()
print(classification_report(Y_test,result9))
print()
print("Area under curve (auc) of model9 is: ", metrics.roc_auc_score(Y_test, result9))
print()
print('Classification report for model 10, K neighbors classifier, is: ')
print(confusion_matrix(Y_test,result10))
print()
print(classification_report(Y_test,result10))
print()
print("Area under curve (auc) of model10 is: ", metrics.roc_auc_score(Y_test, result10))
print()
THE FOLLOWING OUTPUT REPRESENT ACCURACIES OF 5-FOLD VALIDATIONS FROM VARIOUS ML MODELS:
Cross-validated scores for model6, Decision Tree Classifier, is:[ 0.61364665 0.75754735 0.77046902]
Cross-validated scores for model7, Random Forest Classifier, is:[ 0.62463637 0.79326395 0.8073181 ]
Cross-validated scores for model8, Adaboost Classifier, is:[ 0.64916931 0.81960696 0.84196916]
Cross-validated scores for model9, Gradient Boosting Classifier, is:[ 0.64910466 0.82177258 0.83909235]
Cross-validated scores for model10, K Neighbors Classifier, is:[ 0.61180425 0.75412115 0.73012897]
THE FOLLOWING OUTPUT REPRESENT RESULTS FROM VARIOUS ML MODELS:
Classification report for model 6, decision tree classifier, is:
[[6975 1804]
[1893 7891]]
precision recall f1-score support
-1 0.79 0.79 0.79 8779
1 0.81 0.81 0.81 9784
avg / total 0.80 0.80 0.80 18563
Area under curve (auc) of model6 is: 0.800515237805
Classification report for model 7, random forest classifier, is:
[[6883 1896]
[1216 8568]]
precision recall f1-score support
-1 0.85 0.78 0.82 8779
1 0.82 0.88 0.85 9784
avg / total 0.83 0.83 0.83 18563
Area under curve (auc) of model7 is: 0.829872762782
Classification report for model 8, adaboost classifier, is:
[[5851 2928]
[ 891 8893]]
precision recall f1-score support
-1 0.87 0.67 0.75 8779
1 0.75 0.91 0.82 9784
avg / total 0.81 0.79 0.79 18563
Area under curve (auc) of model8 is: 0.787704885721
Classification report for model 9, gradient boosting classifier, is:
[[5905 2874]
[ 918 8866]]
precision recall f1-score support
-1 0.87 0.67 0.76 8779
1 0.76 0.91 0.82 9784
avg / total 0.81 0.80 0.79 18563
Area under curve (auc) of model9 is: 0.789400603089
Classification report for model 10, K neighbors classifier, is:
[[6467 2312]
[1666 8118]]
precision recall f1-score support
-1 0.80 0.74 0.76 8779
1 0.78 0.83 0.80 9784
avg / total 0.79 0.79 0.79 18563
Area under curve (auc) of model10 is: 0.783183129908
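Note that the confusion matrices and AUC above are not the same metric that cross_val_score reports (plain accuracy by default). For a like-for-like comparison, the held-out test-set accuracy can be computed directly with accuracy_score. A minimal sketch, using synthetic data in place of my whole_data variables:

```python
# Sketch: compute plain accuracy on the held-out test set, the same
# default metric that cross_val_score reports, so rankings are comparable.
# make_classification stands in for whole_data_input / whole_data_output.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model7 = RandomForestClassifier(n_estimators=50, random_state=0)
model7.fit(X_train, Y_train)
result7 = model7.predict(X_test)

# This number is directly comparable to the fold accuracies
# returned by cross_val_score(model7, X, y, cv=5).
print(accuracy_score(Y_test, result7))
```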
Try setting cv=StratifiedKFold(n_splits=5, shuffle=True) in your cross_val_score and see whether it makes a difference. My understanding is that train_test_split samples randomly within classes, but cross_val_score does not (by default).
You can import the stratified k-fold with from sklearn.model_selection import StratifiedKFold
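A minimal sketch of the suggestion, with synthetic data standing in for your whole_data_input / whole_data_output:

```python
# Pass a shuffled, stratified splitter to cross_val_score instead of
# the plain cv=5 (which splits the data in order, without shuffling).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(n_estimators=50, random_state=0),
                         X, y, cv=cv)
print(scores)         # one accuracy per fold
print(scores.mean())  # average CV accuracy
```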