Scoring Metric for scikit-learn's LassoCV

Question

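I am using scikit-learn's LassoCV function. During cross-validation, what scoring metric is used by default?

I would like the cross-validation to be based on "Mean squared error regression loss". Can this metric be used with LassoCV? A scoring metric can be specified for LogisticRegressionCV, so can the same be done for LassoCV?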

Answer 1

LassoCV uses R^2 as the scoring metric. From the docs:

By default, parameter search uses the score function of the estimator to evaluate a parameter setting. These are the sklearn.metrics.accuracy_score for classification and sklearn.metrics.r2_score for regression.

To use a different scoring metric, such as mean squared error, you need to use GridSearchCV or RandomizedSearchCV (instead of LassoCV) and specify the scoring parameter as scoring='neg_mean_squared_error'. From the docs:

An alternative scoring function can be specified via the scoring parameter to GridSearchCV, RandomizedSearchCV and many of the specialized cross-validation tools described below.
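
As a rough illustration of the GridSearchCV route, here is a minimal sketch. The synthetic data from make_regression, the alpha grid, and the choice of 5 folds are assumptions made for the example, not something taken from the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (an assumption for illustration only).
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

# Hypothetical grid of candidate alphas; pick a range that suits your data.
alphas = np.logspace(-3, 1, 20)

search = GridSearchCV(
    Lasso(max_iter=10_000),
    param_grid={"alpha": alphas},
    scoring="neg_mean_squared_error",  # higher is better, so MSE is negated
    cv=5,
)
search.fit(X, y)

best_alpha = search.best_params_["alpha"]
best_mse = -search.best_score_  # flip the sign back to a plain MSE
print("best alpha:", best_alpha)
print("cross-validated MSE:", best_mse)
```

Because scikit-learn scorers follow a "greater is better" convention, best_score_ is the negative MSE, which is why the sign is flipped before printing.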

Answer 2

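I think the accepted answer is wrong, because it quotes the documentation for grid search, whereas LassoCV uses a regularization path, not a grid search. In fact, the LassoCV documentation page states that the loss function is: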

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

which means it minimizes the MSE (plus the LASSO penalty term).
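
One way to check which criterion selects alpha is to inspect the fitted model. LassoCV exposes mse_path_, the mean squared error on the held-out fold for every alpha, alongside alphas_ and alpha_. The sketch below uses synthetic data and 5 folds (both assumptions for the example) and compares the chosen alpha_ with the alpha that minimizes the fold-averaged MSE. It also calls score(), which for regressors returns R^2, so the two quantities are easy to tell apart.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic regression data (an assumption for illustration only).
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

model = LassoCV(cv=5, max_iter=10_000, random_state=0).fit(X, y)

# mse_path_ has shape (n_alphas, n_folds): held-out MSE per alpha and fold.
mean_mse = model.mse_path_.mean(axis=1)
alpha_from_path = model.alphas_[np.argmin(mean_mse)]

print("alpha chosen by LassoCV:     ", model.alpha_)
print("alpha minimizing mean CV MSE:", alpha_from_path)

# score() is the estimator's default metric for regressors, i.e. R^2.
r2 = model.score(X, y)
print("R^2 on the training data:", r2)
```

If the two printed alphas agree, the selection is being driven by cross-validated MSE, while R^2 only appears when score() is called on the fitted estimator.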

Tags: python, lasso-regression, scikit-learn, cross-validation