How to run a 10-fold cross-validation for a LASSO with a pre-selected lambda

I have chosen a lambda by running the LASSO multiple times with glmnet and taking the mean lambda. I now want to run a 10-fold cross-validation for this LASSO with this lambda.

Here is a sample of the code I have tried so far:

library(caret)
library(glmnet)
train.control = trainControl(method = "cv", number = 10)


lm.out = lm(outcome ~ 0 +., data = df)
x = model.matrix(lm.out)
y = df$outcome

model = train(glmnet(x, y, lambda = mean(Lambda_LASSO)),
              data = df, trControl = train.control)

Here Lambda_LASSO is the vector of lambdas taken from the iterated runs of cv.glmnet.

First of all, I have to say this sounds odd:

I have chosen a lambda by running the LASSO multiple times and taking the mean lambda

What is the purpose of taking the mean of the lambda values?

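Next time, please provide an example dataset and say whether this is classification or regression. Suppose your df looks like this, and we get the lambdas from glmnet: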

df = data.frame(matrix(runif(50*30),ncol=30))
df$outcome = rnorm(50)

x = model.matrix(outcome ~ 0 +., data = df)
y = df$outcome

Lambda_LASSO = glmnet(x,y)$lambda

You can pass it to caret using tuneGrid = and fix alpha at 1, since you are using the lasso:

train.control = trainControl(method = "cv", number = 10)

model = train(x = x, y = y,
              tuneGrid = data.frame(alpha = 1, lambda = mean(Lambda_LASSO)),
              trControl = train.control,
              method = "glmnet")


glmnet 

50 samples
30 predictors

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 43, 46, 46, 45, 46, 45, ... 
Resampling results:

  RMSE      Rsquared   MAE     
  1.519513  0.3486916  1.286363

Tuning parameter 'alpha' was held constant at a value of 1
Tuning parameter 'lambda' was held constant at a value of 0.03752899
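
If Lambda_LASSO really comes from repeated cv.glmnet runs, as the question describes, the whole thing might look roughly like the sketch below. This is only an illustration under assumptions: the data are simulated, the seed and the 20 repeats are arbitrary, lambda.min is assumed to be the value collected from each run (the question does not say whether it was lambda.min or lambda.1se), and lambda_fixed and fit are names made up for the example.

```r
library(glmnet)
library(caret)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Simulated regression data with the same shape as in the answer
df <- data.frame(matrix(runif(50 * 30), ncol = 30))
df$outcome <- rnorm(50)
x <- model.matrix(outcome ~ 0 + ., data = df)
y <- df$outcome

# Assumption: one lambda.min per cv.glmnet run, 20 runs (arbitrary count)
Lambda_LASSO <- replicate(20, cv.glmnet(x, y, alpha = 1, nfolds = 10)$lambda.min)
lambda_fixed <- mean(Lambda_LASSO)

# 10-fold CV in caret with both alpha and lambda held fixed
fit <- train(x = x, y = y,
             method = "glmnet",
             tuneGrid = data.frame(alpha = 1, lambda = lambda_fixed),
             trControl = trainControl(method = "cv", number = 10))

fit$results   # RMSE, Rsquared and MAE averaged over the 10 folds
fit$resample  # the same metrics for each individual fold
```

Because the one-row tuneGrid leaves nothing to tune, train only estimates the out-of-fold error at that single (alpha, lambda) pair. On pure-noise data like this the lasso may drop every predictor in some folds, in which case caret can report Rsquared as NA with a warning.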