Trying to use the exact=TRUE feature in R glmnet

I am trying to use the exact=TRUE feature in glmnet, but I get an error message.

> fit = glmnet(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty)
> coef.exact = coef(fit, s = 0.03, exact = TRUE)
Error: used coef.glmnet() or predict.glmnet() with `exact=TRUE` so must in addition supply original argument(s)  x and y and penalty.factor  in order to safely rerun glmnet

How do I supply penalty.factor to coef.exact?

Options tried:

> coef.exact = coef(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty, s = 0.03, exact = TRUE)
Error: $ operator is invalid for atomic vectors
> 
> coef.exact = coef((as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error: unexpected ',' in "coef.exact = coef((as.matrix(((x_values))),"
> 
> coef.exact = coef((as.matrix(((x_values))) (as.matrix(y_values)) penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error: unexpected symbol in "coef.exact = coef((as.matrix(((x_values))) (as.matrix(y_values)) penalty"
> 
> coef.exact = coef(fit(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error in fit(as.matrix(((x_values))), (as.matrix(y_values)), penalty = variable.list$penalty) : 
  could not find function "fit"
> 
> coef.exact = coef(glmnet(as.matrix(((x_values))), (as.matrix(y_values)),penalty=variable.list$penalty), s = 0.03, exact = TRUE)
Error: used coef.glmnet() or predict.glmnet() with `exact=TRUE` so must in addition supply original argument(s)  x and y and penalty.factor  in order to safely rerun glmnet
> 

The s argument of coef corresponds to the penalty parameter lambda. From the help file:

s Value(s) of the penalty parameter lambda at which predictions are required. Default is the entire sequence used to create the model.

[...]

With exact=TRUE, these different values of s are merged (and sorted) with object$lambda, and the model is refit before predictions are made. In this case, it is required to supply the original data x= and y= as additional named arguments to predict() or coef(). The workhorse predict.glmnet() needs to update the model, and so needs the data used to create it. The same is true of weights, offset, penalty.factor, lower.limits, upper.limits if these were used in the original call. Failure to do so will result in an error.

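So, to use exact = TRUE, you must supply the original x and y, plus penalty.factor and any of the other arguments listed above that you passed to the original model.

Here is an example using mtcars as sample data. Note that when posting on SO, it is always advisable to provide a minimal & reproducible code example that includes sample data.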

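library(glmnet)

# Fit mpg ~ wt + disp
x <- as.matrix(mtcars[c("wt", "disp")])
y <- mtcars[, "mpg"]
# `penalty` in the original call partially matches glmnet's `penalty.factor`
# argument, so it is spelled out in full here. penalty.factor is documented
# as one value per column of x; the scalar is kept from the original example.
fit <- glmnet(x, y, penalty.factor = 0.1)

# s is our regularisation parameter, and since we want exact results
# for s = 0.035, we need to refit the model using the full data (x, y)
# and the same penalty.factor as in the original fit
coef.exact <- coef(fit, s = 0.035, exact = TRUE, x = x, y = y, penalty.factor = 0.1)
coef.exact
#3 x 1 sparse Matrix of class "dgCMatrix"
#                      1
#(Intercept) 34.40289989
#wt          -3.00225110
#disp        -0.02016836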

The reason you need to explicitly supply x and y again is given in ?coef.glmnet (see also @FelipeAlvarenga's post).


So in your case, the following should work:

fit = glmnet(x = as.matrix(x_values), y = y_values, penalty.factor = variable.list$penalty)
coef.exact = coef(
    fit,
    s = 0.03,
    exact = TRUE,
    x = as.matrix(x_values),
    y = y_values,
    penalty.factor = variable.list$penalty)
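
A side note on why the error message asks for penalty.factor even though the call in the question used penalty=: R partially matches argument names that come before ... in a function's signature, so penalty= is presumably resolved to glmnet's penalty.factor, and the stored call records the full name. Names passed through ... are not partially matched, which is likely why coef() needs the name spelled out in full. Below is a minimal base-R sketch of that behaviour. toy_fit and dot_names are hypothetical stand-ins made up for illustration and are not part of glmnet.

```r
# toy_fit is a hypothetical stand-in for a fitting function; it is NOT glmnet.
# It only returns the call as R recorded it after argument matching.
toy_fit <- function(x, y, penalty.factor = 1, ...) {
  match.call()
}

# `penalty` is not an exact argument name, but it partially matches
# `penalty.factor`, so the recorded call carries the full name.
recorded <- toy_fit(1:3, 4:6, penalty = 0.5)
recorded_names <- names(recorded)[-1]  # drop the empty name of the function slot
print(recorded_names)

# dot_names is a hypothetical helper that lists the names captured by `...`.
# Names inside `...` are kept exactly as typed; no partial matching happens.
dot_names <- function(...) names(list(...))
print("penalty.factor" %in% dot_names(penalty = 0.5))
print("penalty.factor" %in% dot_names(penalty.factor = 0.5))
```

In other words, writing penalty= is enough when fitting, but when calling coef() or predict() with exact = TRUE it is safer to write penalty.factor= in full.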

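Some comments

A possible source of confusion is the difference between the model's overall regularisation parameter (s or lambda) and the penalty.factor values that can be applied to each coefficient. The latter allow differential regularisation of individual parameters, whereas s controls the overall strength of the L1/L2 regularisation.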
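
To make the distinction concrete, the elastic-net objective for the Gaussian family, as described in the glmnet documentation, can be sketched as below. The notation is an assumption chosen for illustration: v_j is the penalty factor of coefficient j, N the number of observations, p the number of predictors, and alpha the elastic-net mixing parameter (alpha = 1 is the lasso).

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2
+\lambda\sum_{j=1}^{p} v_j\Bigl[\tfrac{1-\alpha}{2}\,\beta_j^2+\alpha\,\lvert\beta_j\rvert\Bigr]
```

Here lambda is the value passed as s, while the v_j are the entries of penalty.factor (all 1 by default; a value of 0 leaves that coefficient unpenalised). The glmnet documentation also notes that the penalty factors are internally rescaled to sum to the number of variables.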