What is the difference between the summary() and print() in caret (R)
In the context of modelling with R's caret package, what is the difference between the summary() and print() functions? And what exactly is the variance explained here: for a model with 4 components, is it 28.52% or 21.4%?
> summary(model)
Data: X dimension: 261 130
Y dimension: 261 1
Fit method: oscorespls
Number of components considered: 4
TRAINING: % variance explained
         1 comps  2 comps  3 comps  4 comps
X        90.1526    92.91    94.86    96.10
.outcome  0.8772    17.17    23.99    28.52
versus
> print(model)
Partial Least Squares
261 samples
130 predictors
No pre-processing
Resampling: Cross-Validated (5 fold, repeated 50 times)
Summary of sample sizes: 209, 209, 209, 208, 209, 209, ...
Resampling results across tuning parameters:
  ncomp  RMSE      Rsquared    MAE
  1      5.408986  0.03144022  4.129525
  2      5.124799  0.14263362  3.839493
  3      4.976591  0.19114791  3.809596
  4      4.935419  0.21415260  3.799365
  5      5.054086  0.19887704  3.886382
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was ncomp = 4.
There are two parts to this. The first is the type of model you fitted/trained: since you used partial least squares regression, summary(model) returns information about the best model that caret selected.
library(caret)
library(pls)
model = train(mpg ~ ., data = mtcars,
              trControl = trainControl(method = "cv", number = 5),
              method = "pls")
Partial Least Squares
32 samples
10 predictors
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 25, 27, 26, 24, 26
Resampling results across tuning parameters:
  ncomp  RMSE      Rsquared   MAE
  1      3.086051  0.8252487  2.571524
  2      3.129871  0.8122175  2.650973
  3      3.014511  0.8582197  2.519962
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was ncomp = 3.
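As a sketch (assuming a model object fitted as in the snippet above; the resampling is random, so exact numbers will vary between runs), the tuning table that print(model) displays can also be accessed directly from the train object:

```r
library(caret)
library(pls)

# refit as above; results depend on the random cross-validation folds
model <- train(mpg ~ ., data = mtcars,
               trControl = trainControl(method = "cv", number = 5),
               method = "pls")

model$results   # the ncomp / RMSE / Rsquared / MAE table shown by print(model)
model$bestTune  # the ncomp value caret picked (smallest RMSE)
subset(model$results, ncomp == model$bestTune$ncomp)  # the winning row
```

This is handy when you want to plot or post-process the resampling results rather than read them off the console.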
When you do print(model), you are looking at the results of training the model and selecting the best tuning parameter. With pls, the tuning parameter is the number of components; this output comes from caret and looks similar for other methods. Above, models with 1, 2 and 3 components were tested, and the model with 3 components was chosen because it had the smallest RMSE. The final model is stored under model$finalModel, and you can look at it:
class(model$finalModel)
[1] "mvr"
pls:::summary.mvr(model$finalModel)
Data: X dimension: 32 10
Y dimension: 32 1
Fit method: oscorespls
Number of components considered: 3
TRAINING: % variance explained
         1 comps  2 comps  3 comps
X          92.73    99.98    99.99
.outcome   74.54    74.84    83.22
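A note on where those numbers come from: the X row of the table is cumulative, and the pls helper explvar() returns the per-component percentages it is built from (a sketch, assuming the model fitted above; explvar() accepts any "mvr" object, which is what model$finalModel is):

```r
library(pls)

# per-component % of predictor (X) variance captured by each latent variable
pv <- explvar(model$finalModel)
pv          # one value per component
cumsum(pv)  # cumulative values, matching the X row printed by summary()
```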
From the above you can see that the summary function being called comes from the pls package and is specific to this class of model; summary(model) below gives you the same output:
summary(model)
Data: X dimension: 32 10
Y dimension: 32 1
Fit method: oscorespls
Number of components considered: 3
TRAINING: % variance explained
         1 comps  2 comps  3 comps
X          92.73    99.98    99.99
.outcome   74.54    74.84    83.22
Partial least squares regression is similar to principal component analysis, except that the decomposition (or dimension reduction) is done on transpose(X) * Y, and the components are called latent variables. So in the summary, what you see is the proportion of variance in X (all your predictors) and in .outcome (the dependent variable) that is explained by the latent variables. For your model, the 28.52% is the outcome variance explained on the training data with 4 components, while the 21.4% Rsquared from print(model) is the cross-validated estimate of the same quantity, which is why it is lower.
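As a concrete check (a sketch using pls::plsr directly on mtcars, not your data), the .outcome row can be reproduced by hand: it is the training R², i.e. the drop in residual sum of squares relative to the total sum of squares as components are added:

```r
library(pls)

fit <- plsr(mpg ~ ., data = mtcars, ncomp = 3, method = "oscorespls")

y   <- mtcars$mpg
tss <- sum((y - mean(y))^2)               # total variation around the mean

# % of outcome variance explained with k components (training R^2 * 100)
sapply(1:3, function(k) {
  rss <- sum((y - fitted(fit)[, , k])^2)  # residuals of the k-component fit
  100 * (1 - rss / tss)
})
```

These values match the .outcome row of summary(fit) on the training data.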