I am not clear on the meaning of the best_score_ from GridSearchCV

Question

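I ran experiments with several models and generated a best score for each one to help me decide which to choose as my final model. The best scores were produced with the following code: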

print(f'Ridge score is {np.sqrt(ridge_grid_search.best_score_ * -1)}')
print(f'Lasso score is {np.sqrt(lasso_grid_search.best_score_ * -1)}')
print(f'ElasticNet score is {np.sqrt(ElasticNet_grid_search.best_score_ * -1)}')
print(f'KRR score is {np.sqrt(KRR_grid_search.best_score_ * -1)}')
print(f'GradientBoosting score is {np.sqrt(gradientBoost_grid_search.best_score_ * -1)}')
print(f'XGBoosting score is {np.sqrt(XGB_grid_search.best_score_ * -1)}')
print(f'LGBoosting score is {np.sqrt(LGB_grid_search.best_score_ * -1)}')

The results are posted here:

Ridge score is 0.11353489315048314
Lasso score is 0.11118171778462431
ElasticNet score is 0.11122236468840378
KRR score is 0.11322596291030147
GradientBoosting score is 0.11111049287476948
XGBoosting score is 0.11404604560959673
LGBoosting score is 0.11299104859531962

I don't know how to choose the best model. Is XGBoosting my best model in this case?

Answer 1

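Your code is not provided, but from the name ridge_grid_search I assume you are using sklearn.model_selection.GridSearchCV for model selection. GridSearchCV is meant for tuning the hyperparameters of a single model and should not be used to compare different models. ridge_grid_search.best_score_ returns the best score found during the grid search for that algorithm, i.e. the mean cross-validated score of the best hyperparameter combination.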
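To make the meaning of best_score_ concrete, here is a minimal sketch. It assumes synthetic data from make_regression, a small hypothetical alpha grid, and scoring="neg_mean_squared_error" (a guess based on the np.sqrt(... * -1) in the question; the actual scoring was not shown). It checks that best_score_ is simply the entry of cv_results_["mean_test_score"] for the best parameter combination.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only for illustration (not the asker's dataset).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Hypothetical grid; the asker's real grid is unknown.
ridge_grid_search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.1, 1.0, 10.0]},
    scoring="neg_mean_squared_error",  # assumed scoring; higher (closer to 0) is better
    cv=5,
)
ridge_grid_search.fit(X, y)

# One mean cross-validated score per parameter combination.
mean_scores = ridge_grid_search.cv_results_["mean_test_score"]
best_from_table = mean_scores[ridge_grid_search.best_index_]

# Convert the negated MSE back to an RMSE-like number, as in the question.
rmse = np.sqrt(-ridge_grid_search.best_score_)

print("best params:", ridge_grid_search.best_params_)
print("best_score_:", ridge_grid_search.best_score_)
print("same value from cv_results_:", best_from_table)
print("RMSE-like value:", rmse)
```

Under this assumed scoring, the printed RMSE-like value is an error, so a smaller number indicates a better fit.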

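For model comparison, you should use a cross-validation procedure such as k-fold cross-validation. When using cross-validation, make sure every model is trained and tested on the same training/testing splits so that the comparison is fair.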
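A comparison along these lines could look like the sketch below. The data, the three models, and their hyperparameters are placeholders chosen for illustration, and the scoring is again assumed to be neg_mean_squared_error. The key point is that one KFold object with a fixed random_state is passed to every cross_val_score call, so each model sees identical splits.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data, used only for illustration.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# A single splitter with a fixed seed: every model gets the same folds.
cv = KFold(n_splits=5, shuffle=True, random_state=42)

# Placeholder models; in practice these could be the tuned estimators.
models = {
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "GradientBoosting": GradientBoostingRegressor(n_estimators=50, random_state=0),
}

rmse_by_model = {}
for name, model in models.items():
    # cross_val_score returns one negated MSE per fold.
    neg_mse = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=cv)
    rmse_by_model[name] = float(np.sqrt(-neg_mse.mean()))
    print(f"{name}: RMSE = {rmse_by_model[name]:.4f}")

# RMSE is an error, so the smallest value wins.
best_name = min(rmse_by_model, key=rmse_by_model.get)
print("lowest RMSE:", best_name)
```

If the numbers in the question are RMSE values, as the np.sqrt(... * -1) suggests, then the model with the smallest value would be preferred rather than the one with the largest. The differences there are small, so it is also worth looking at the per-fold spread before picking a winner.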

python

pipeline

gridsearchcv