Cannot obtain probability predictions when running elasticnet logistic regression with glmnet in caret package
I noticed that when running penalized logistic regression in caret with the glmnet package, the model predictions are reported as hard 0/1 class labels:
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
train_control <- trainControl(method="cv", number=10, savePredictions = TRUE)
glmnetGrid <- expand.grid(alpha=c(0, .5, 1), lambda=c(.1, 1, 10))
model<- train(as.factor(admit) ~ ., data=mydata, trControl=train_control, method="glmnet", family="binomial", tuneGrid=glmnetGrid, metric="Accuracy", preProcess=c("center","scale"))
model
glmnet
400 samples
3 predictor
2 classes: '0', '1'
Pre-processing: centered (3), scaled (3)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 360, 360, 361, 359, 360, 361, ...
Resampling results across tuning parameters:
alpha lambda Accuracy Kappa Accuracy SD Kappa SD
0.0 0.1 0.6923233271 0.09027099758 0.018975211636 0.06988057154
0.0 1.0 0.6825703565 0.00000000000 0.007557700521 0.00000000000
0.0 10.0 0.6825703565 0.00000000000 0.007557700521 0.00000000000
0.5 0.1 0.6825703565 0.00000000000 0.007557700521 0.00000000000
0.5 1.0 0.6825703565 0.00000000000 0.007557700521 0.00000000000
0.5 10.0 0.6825703565 0.00000000000 0.007557700521 0.00000000000
1.0 0.1 0.6825703565 0.00000000000 0.007557700521 0.00000000000
1.0 1.0 0.6825703565 0.00000000000 0.007557700521 0.00000000000
1.0 10.0 0.6825703565 0.00000000000 0.007557700521 0.00000000000
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were alpha = 0 and lambda = 0.1.
> head(model$pred)
pred obs rowIndex alpha lambda Resample
1 0 0 16 0 10 Fold01
2 0 0 17 0 10 Fold01
3 0 0 24 0 10 Fold01
4 0 1 46 0 10 Fold01
5 0 0 69 0 10 Fold01
6 0 0 84 0 10 Fold01
> summary(model$pred)
pred obs rowIndex alpha lambda Resample
0:3576 0:2457 Min. : 1.00 Min. :0.0 Min. : 0.1 Length:3600
1: 24 1:1143 1st Qu.:100.75 1st Qu.:0.0 1st Qu.: 0.1 Class :character
Median :200.50 Median :0.5 Median : 1.0 Mode :character
Mean :200.50 Mean :0.5 Mean : 3.7
3rd Qu.:300.25 3rd Qu.:1.0 3rd Qu.:10.0
Max. :400.00 Max. :1.0 Max. :10.0
Is it possible to obtain the raw predicted probabilities (the inverse logit of the linear predictor) rather than the 0/1 class predictions?
You have to set the option classProbs = TRUE in trainControl. The levels of the admit factor must also be valid R names (not "0"/"1"), because they will be used as column names for the probabilities. See the example below.
library(caret)
mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
mydata$admit <- as.factor(mydata$admit)
# create yes/no levels so the class probability columns get valid names
levels(mydata$admit) = c("yes", "no")
train_control <- trainControl(method="cv", number=10, classProbs = TRUE, savePredictions = TRUE)
glmnetGrid <- expand.grid(alpha=c(0, .5, 1), lambda=c(.1, 1, 10))
set.seed(4242)
model<- train(admit ~ .,
data=mydata,
trControl = train_control,
method="glmnet",
family="binomial",
tuneGrid=glmnetGrid,
metric="Accuracy",
preProcess=c("center","scale"))
head(model$pred)
pred obs rowIndex yes no alpha lambda Resample
1 yes no 4 0.6856383 0.3143617 0 10 Fold01
2 yes no 6 0.6796251 0.3203749 0 10 Fold01
3 yes yes 10 0.6764742 0.3235258 0 10 Fold01
4 yes yes 71 0.6795685 0.3204315 0 10 Fold01
5 yes no 78 0.6774003 0.3225997 0 10 Fold01
6 yes yes 82 0.6812158 0.3187842 0 10 Fold01
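Beyond the resampled predictions stored in model$pred, per-observation probabilities can also be requested from the final tuned model via caret's predict method with type = "prob" (a minimal sketch, reusing the model and mydata objects from the example above):

```r
# Class probabilities from the final model: one column per factor
# level ("yes"/"no" here), rows summing to 1.
probs <- predict(model, newdata = mydata, type = "prob")
head(probs)

# The default type = "raw" returns hard class labels instead:
classes <- predict(model, newdata = mydata)
```

Note that type = "prob" only works when the model was trained with classProbs = TRUE, as in the trainControl call above.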