How to select features for random forest using the varImp function?
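I applied a random forest to my training data, which has around 100 features. Now I want to apply a feature selection technique to reduce the number of features before fitting the random forest model. How can I use the varImp function (from the caret package) to select the important features? I read that varImp itself uses some classification method to select features (which I find very counterintuitive). How exactly do I apply varImp to obtain the important subset of features, which I can then use when applying the random forest classification algorithm?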
From the caret package author, Max Kuhn, on feature selection:
Many models that can be accessed using caret's train function produce
prediction equations that do not necessarily use all the predictors.
These models are thought to have built-in feature selection
and rf is one of them.
Many of the functions have an ancillary method called predictors that
returns a vector indicating which predictors were used in the final
model.
If you want to retrieve the importance scores from the model, add importance = TRUE to the train() call.
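As a minimal sketch of the two points above, the following fits a random forest through train() with importance = TRUE, then reads the scores with varImp() and the variables used with predictors(). The iris data, the 5-fold cross-validation, and ntree = 100 are assumptions chosen only to keep the example small; they are not part of the original question.

```r
library(caret)  # method = "rf" also needs the randomForest package installed

set.seed(42)

# Illustrative resampling setup (an assumption, not a recommendation)
ctrl <- trainControl(method = "cv", number = 5)

# importance = TRUE and ntree are passed through train() to randomForest()
fit <- train(Species ~ ., data = iris,
             method = "rf",
             importance = TRUE,
             ntree = 100,
             trControl = ctrl)

# Importance scores; for classification with importance = TRUE
# the data frame in imp$importance has one column per class
imp <- varImp(fit)
print(imp)

# Names of the predictors actually used in the final model
used <- predictors(fit)
print(used)
```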
In many cases, using these models with built-in feature selection will
be more efficient than algorithms where the search routine for the
right predictors is external to the model. Built-in feature selection
typically couples the predictor search algorithm with the parameter
estimation and are usually optimized with a single objective function
(e.g. error rates or likelihood).
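If you still want an explicit subset to refit on, one possible approach is to rank the predictors by their varImp() scores and keep the top few. This is only a sketch: the quoted documentation does not prescribe it, and the cutoff k, the averaging of per-class scores with rowMeans(), the iris data, and the resampling settings are all illustrative assumptions.

```r
library(caret)  # method = "rf" also needs the randomForest package installed

set.seed(1)
ctrl <- trainControl(method = "cv", number = 5)

x <- iris[, 1:4]
y <- iris$Species

# Fit on all predictors and ask randomForest to compute importance
full_fit <- train(x = x, y = y, method = "rf",
                  importance = TRUE, ntree = 100, trControl = ctrl)

# Collapse the importance table to one score per predictor.
# Averaging across class columns is an assumption of this sketch.
scores <- rowMeans(varImp(full_fit, scale = FALSE)$importance)

# Keep the k highest-scoring predictors (k is a hypothetical cutoff)
k <- 2
top_vars <- names(sort(scores, decreasing = TRUE))[seq_len(k)]
print(top_vars)

# Refit the random forest on the reduced set of predictors
reduced_fit <- train(x = x[, top_vars, drop = FALSE], y = y,
                     method = "rf", ntree = 100,
                     tuneGrid = data.frame(mtry = seq_len(k)),
                     trControl = ctrl)
print(reduced_fit)
```

Because the features here are chosen using the same data that is later resampled, the cross-validated accuracy of the reduced model will tend to be optimistic.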