glmnet not converging for lambda.min from cv.glmnet

I ran a 20-fold cv.glmnet lasso model to obtain the "optimal" value of lambda. However, when I try to reproduce the result with glmnet(), I get the following warnings:

Warning messages:
1: from glmnet Fortran code (error code -1); Convergence for 1th lambda
   value not reached after maxit=100000 iterations; solutions for larger 
   lambdas returned 
2: In getcoef(fit, nvars, nx, vnames) :
   an empty model has been returned; probably a convergence issue

My code looks like this:

set.seed(5)
cv.out <- cv.glmnet(x[train,],y[train],family="binomial",nfolds=20,alpha=1,parallel=TRUE)
coef(cv.out)
bestlam <- cv.out$lambda.min
lasso.mod.best <- glmnet(x[train,],y[train],alpha=1,family="binomial",lambda=bestlam)

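The value of bestlam above is 2.976023e-05, so could that be what causes the problem? Is it a rounding issue with the lambda value? Why can't I reproduce the result directly from the glmnet() function? If I use a vector of lambda values in a range similar to this bestlam value, I don't have any problems.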

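glmnet is a bit tricky in this respect: you'll want to fit your best model over a whole series of lambdas (for example, by setting nlambda=101), and then, when you predict, set s=bestlam and exact=FALSE.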
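A minimal sketch of that workflow follows. The question's x, y and train are not shown, so the data here are simulated stand-ins (an assumption), and parallel=TRUE is left out so that no parallel backend is needed.

```r
library(glmnet)

# Simulated stand-in data (assumption: the real x, y and train are not shown)
set.seed(5)
n <- 400; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - 0.5 * x[, 2]))
train <- sample(n, 300)

# Cross-validate to choose lambda, as in the question
cv.out <- cv.glmnet(x[train, ], y[train], family = "binomial", nfolds = 20, alpha = 1)
bestlam <- cv.out$lambda.min

# Fit the whole regularization path rather than a single lambda
lasso.mod <- glmnet(x[train, ], y[train], family = "binomial", alpha = 1, nlambda = 101)

# Pick out the solution at bestlam afterwards;
# exact = FALSE interpolates between the lambdas on the fitted path
lasso.coef <- predict(lasso.mod, type = "coefficients", s = bestlam, exact = FALSE)
lasso.prob <- predict(lasso.mod, newx = x[-train, ], type = "response",
                      s = bestlam, exact = FALSE)
```

Because bestlam lies inside the range of the fitted path, predict() only has to interpolate between neighbouring solutions, so the single-lambda fit that failed to converge is never attempted.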

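You are passing a single lambda to glmnet (lambda=bestlam), which is a big no-no: you are trying to train the model using only one lambda value, so the fit cannot use warm starts from solutions at larger lambdas.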

From the glmnet documentation (?glmnet):

lambda: A user supplied lambda sequence. Typical usage is to have the 
program compute its own lambda sequence based on nlambda and 
lambda.min.ratio. Supplying a value of lambda overrides this. WARNING: use 
with care. Do not supply a single value for lambda (for predictions after CV 
use predict() instead). Supply instead a decreasing sequence of lambda 
values. glmnet relies on its warms starts for speed, and its often faster to 
fit a whole path than compute a single fit.
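
The two alternatives the documentation points to can be sketched as follows: calling predict()/coef() on the cross-validated object, or refitting on a decreasing lambda sequence that stops at the chosen value. The data are again simulated stand-ins (an assumption), and lam.seq, fit.seq and the other result names are illustrative only.

```r
library(glmnet)

# Simulated stand-in data (assumption: the real x, y and train are not shown)
set.seed(5)
n <- 400; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - 0.5 * x[, 2]))
train <- sample(n, 300)

cv.out <- cv.glmnet(x[train, ], y[train], family = "binomial", nfolds = 20, alpha = 1)
bestlam <- cv.out$lambda.min

# Option 1: use the cross-validated object directly; it already holds the
# full-path fit on the training data
coef.cv <- coef(cv.out, s = "lambda.min")
prob.cv <- predict(cv.out, newx = x[-train, ], s = "lambda.min", type = "response")

# Option 2: refit on a decreasing lambda sequence that ends at bestlam,
# so the solver can warm-start from the sparser solutions
lam.seq <- cv.out$lambda[cv.out$lambda >= bestlam]
fit.seq <- glmnet(x[train, ], y[train], family = "binomial", alpha = 1, lambda = lam.seq)
coef.seq <- coef(fit.seq, s = bestlam)
```

Both options should give closely matching coefficients at bestlam, since each reaches that lambda by walking down the same path.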