Why do I get probabilities outside 0 and 1 with my regularized logistic glmnet code?

library(tidyverse)
library(caret)
library(glmnet)
library(readxl)  # provides read_excel(); not attached by library(tidyverse)

creditdata <- read_excel("R bestanden/creditdata.xlsx")
df <- as.data.frame(creditdata)
df <- na.omit(df)
df$married <- as.factor(df$married)
df$graduate_school <- as.factor(df$graduate_school)
df$high_school <- as.factor(df$high_school)
df$default_payment_next_month <- as.factor(df$default_payment_next_month)
df$sex <- as.factor(df$sex)
df$single <- as.factor(df$single)
df$university <- as.factor(df$university)
set.seed(123)
training.samples <- df$default_payment_next_month %>%
  createDataPartition(p = 0.8, list = FALSE)
train.data  <- df[training.samples, ]
test.data <- df[-training.samples, ]
x <- model.matrix(default_payment_next_month~., train.data)[,-1]
y <- ifelse(train.data$default_payment_next_month == 1, 1, 0)

cv.lasso <- cv.glmnet(x, y, alpha = 1, family = "binomial")
lasso.model <- glmnet(x, y, alpha = 1, family = "binomial",
                      lambda = cv.lasso$lambda.1se)
x.test <- model.matrix(default_payment_next_month ~., test.data)[,-1]
probabilities <- lasso.model %>% predict(newx = x.test)
predicted.classes <- ifelse(probabilities > 0.5, "1", "0")
observed.classes <- test.data$default_payment_next_month
mean(predicted.classes == observed.classes)

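Hi everyone,

I'm new to R and I've been trying to use the exact code from this website, http://www.sthda.com/english/articles/36-classification-methods-essentials/149-penalized-logistic-regression-essentials-in-r-ridge-lasso-and-elastic-net/, to perform logistic ridge regression. My goal is to predict whether a customer will default on their credit card, and we have a dataset with both factor and numeric variables. The problem is that most of my probabilities are negative and below -1, for example -2.6, -1.4, and so on. Does anyone know what is going wrong here?

Thanks in advance for your help!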

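Just as with glm, the predict function for glmnet returns predictions on the scale of the link function by default, and these are not probabilities. For a binomial model the link is the logit, so the values you are seeing are log-odds, which can be any real number. That is why you get values such as -2.6 and -1.4.

To get predicted probabilities, add type = "response" to the predict call: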

probabilities <- lasso.model %>% predict(newx = x.test, type = "response")
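If it helps to see the two scales side by side, here is a minimal sketch using base R's glm rather than glmnet, since the same link/response distinction applies to both. The data are simulated and the names (sim, fit, link_scale, prob_scale) are made up for illustration; nothing here comes from your credit data.

```r
# Simulated data (hypothetical, for illustration only)
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(0.5 + x1 - x2))
sim <- data.frame(y, x1, x2)

# Plain (unpenalized) logistic regression
fit <- glm(y ~ x1 + x2, data = sim, family = binomial)

# Default prediction type is "link": log-odds, not probabilities
link_scale <- predict(fit, newdata = sim)
# type = "response" applies the inverse logit and returns probabilities
prob_scale <- predict(fit, newdata = sim, type = "response")

range(link_scale)  # log-odds: not restricted to [0, 1], can be negative
range(prob_scale)  # probabilities: within [0, 1]

# plogis() is the inverse logit, so it maps one scale onto the other
all.equal(plogis(link_scale), prob_scale)

# A 0.5 cutoff on probabilities corresponds to a 0 cutoff on log-odds
all((prob_scale > 0.5) == (link_scale > 0))
```

This also shows why the accuracy in your code is off: comparing log-odds against 0.5 is not the same cutoff as comparing probabilities against 0.5. With type = "response" in place, your ifelse(probabilities > 0.5, "1", "0") line should behave as intended.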