Variable importance with ranger
I trained a random forest using caret + ranger.
fit <- train(
  y ~ x1 + x2,
  data = total_set,
  method = "ranger",
  trControl = trainControl(method = "cv", number = 5, allowParallel = TRUE, verbose = TRUE),
  tuneGrid = expand.grid(mtry = c(4, 5, 6)),
  importance = 'impurity'
)
Now I'd like to see the importance of the variables. However, none of these work:
> importance(fit)
Error in UseMethod("importance") : no applicable method for 'importance' applied to an object of class "c('train', 'train.formula')"
> fit$variable.importance
NULL
> fit$importance
NULL
> fit
Random Forest
217380 samples
32 predictors
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 173904, 173904, 173904, 173904, 173904
Resampling results across tuning parameters:
mtry RMSE Rsquared
4 0.03640464 0.5378731
5 0.03645528 0.5366478
6 0.03651451 0.5352838
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was mtry = 4.
Any idea if and how I can get it?
Thanks.
varImp(fit) will get it for you.
To figure that out, I looked at names(fit), which led me to names(fit$modelInfo). There you'll see varImp as one of the entries.
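As a minimal sketch of that lookup, the snippet below fits a small caret + ranger model and reads the importance both through varImp() and from the underlying ranger object. The iris data, the response Sepal.Length, and tuneLength = 2 are assumptions chosen only to keep the example small and self-contained; they are not the asker's data or settings.

```r
library(caret)  # assumes the caret and ranger packages are installed

set.seed(42)
fit <- train(
  Sepal.Length ~ .,
  data = iris,                 # stand-in data set, not the asker's total_set
  method = "ranger",
  trControl = trainControl(method = "cv", number = 5),
  tuneLength = 2,              # small default grid, just for illustration
  importance = "impurity"      # passed through to ranger()
)

# "varImp" shows up among the model-specific entries
print(names(fit$modelInfo))

# caret's wrapper; values are rescaled to 0-100 by default
imp <- varImp(fit)
print(imp)

# The fitted ranger object is stored in fit$finalModel,
# so the unscaled importance can be read from there
raw <- fit$finalModel$variable.importance
print(raw)
```

This also suggests why fit$variable.importance returned NULL in the question: the train object wraps the ranger model rather than being one.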
Building on @fmalaussena's answer:
set.seed(123)
ctrl <- trainControl(method = 'cv',
                     number = 10,
                     classProbs = TRUE,
                     savePredictions = TRUE,
                     verboseIter = TRUE)
rfFit <- train(Species ~ .,
               data = iris,
               method = "ranger",
               importance = "permutation",  # *** the importance mode is set here
               trControl = ctrl,
               verbose = TRUE)
You can pass either "permutation" or "impurity" to the importance argument.
An explanation of both values can be found here: https://alexisperrier.com/datascience/2015/08/27/feature-importance-random-forests-gini-accuracy.html
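For the 'ranger' package itself (a model fitted by calling ranger() directly), you can get the importance with
fit$variable.importance
On a caret train object the ranger model is stored in fit$finalModel, so the same values are at fit$finalModel$variable.importance.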
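A short sketch of the direct ranger route, assuming the model is fitted with ranger() rather than through caret. The iris data and the object name rf are illustrative choices, not taken from the question.

```r
library(ranger)  # assumes the ranger package is installed

set.seed(123)
rf <- ranger(
  Species ~ .,
  data = iris,                  # stand-in data set
  importance = "permutation"    # or "impurity"
)

# Named numeric vector, one entry per predictor
vi <- rf$variable.importance
print(sort(vi, decreasing = TRUE))

# ranger also exports an accessor that returns the same values
vi2 <- ranger::importance(rf)
```

If ranger() is called without the importance argument, no importance is computed, so the argument has to be set at fit time.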
As a side note, you can use str() to see all available outputs of the model:
str(fit)