Correlation cutoff in caret train preProcess
I am building a C5.0 model using the caret package in R.
control <- trainControl(method = "repeatedcv",
                        number = 10,
                        repeats = 3,
                        classProbs = TRUE,
                        sampling = 'smote',
                        returnResamp = "all",
                        summaryFunction = twoClassSummary)
grid <- expand.grid(.winnow = c(FALSE, TRUE),
                    .trials = c(1, 5, 10, 15, 20, 25, 30, 40, 45, 50),
                    .model = c("tree"),
                    .splits = c(2, 5, 10, 15, 20, 25, 50))
c5_model <- train(label ~ .,
                  data = train,
                  trControl = control,
                  method = c5info,
                  tuneGrid = grid,
                  preProcess = c("center", "scale", "nzv", "corr"),
                  verbose = FALSE)
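Is it possible to pass a custom cutoff to the correlation preprocessing step, say 0.75 or any other value I want?
You can specify preprocessing options in trainControl, through the preProcOptions argument: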
library(caret)
library(mlbench)  # for the Sonar data
data(Sonar)
ctrl <- trainControl(method = "repeatedcv",
                     number = 10,
                     repeats = 3,
                     classProbs = TRUE,
                     sampling = 'smote',
                     returnResamp = "all",
                     summaryFunction = twoClassSummary,
                     preProcOptions = list(cutoff = 0.75))  # all preprocessing options go in this list
An example ranger model:
grid <- expand.grid(.mtry = c(2, 5, 10),
                    .min.node.size = 2,
                    .splitrule = "gini")
fit_model <- train(Class ~ .,
                   data = Sonar,
                   trControl = ctrl,
                   metric = "ROC",
                   method = "ranger",
                   tuneGrid = grid,
                   preProcess = c("center", "scale", "nzv", "corr"),
                   verbose = FALSE)
fit_model$preProcess
#output
Created from 679 samples and 60 variables
Pre-processing:
- centered (26)
- ignored (0)
- removed (34)
- scaled (26)
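To see which columns are behind a "removed" count like the one above, one option is to run the same preprocessing steps directly with preProcess() and inspect the result. This is a minimal sketch, not the fitted model from above: it works on the raw Sonar predictors without SMOTE resampling, so the counts may differ, and it assumes the removed column names are stored in pp$method$remove, as in recent caret versions.

```r
library(caret)
library(mlbench)  # provides the Sonar data

data(Sonar)
x <- Sonar[, names(Sonar) != "Class"]  # the 60 numeric predictors

# Same preprocessing steps as in train(), with the cutoff passed directly
pp <- preProcess(x,
                 method = c("center", "scale", "nzv", "corr"),
                 cutoff = 0.75)

removed <- pp$method$remove  # names of the columns dropped by nzv/corr
print(length(removed))
print(head(removed))

# Applying the preprocessing drops exactly those columns
x_pp <- predict(pp, x)
print(ncol(x_pp))
```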
Using a different cutoff:
ctrl2 <- trainControl(method = "repeatedcv",
                      number = 10,
                      repeats = 3,
                      classProbs = TRUE,
                      sampling = 'smote',
                      returnResamp = "all",
                      summaryFunction = twoClassSummary,
                      preProcOptions = list(cutoff = 0.6))
fit_model2 <- train(Class ~ .,
                    data = Sonar,
                    trControl = ctrl2,
                    metric = "ROC",
                    method = "ranger",
                    tuneGrid = grid,
                    preProcess = c("center", "scale", "nzv", "corr"),
                    verbose = FALSE)
fit_model2$preProcess
#output
Created from 679 samples and 60 variables
Pre-processing:
- centered (23)
- ignored (0)
- removed (37)
- scaled (23)
More columns were removed (37 instead of 34).
When we use preProcOptions = list(cutoff = 0.95), fewer columns are removed:
fit_model3$preProcess
#output
Created from 679 samples and 60 variables
Pre-processing:
- centered (55)
- ignored (0)
- removed (5)
- scaled (55)
So the cutoff appears to be applied as expected.
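As a quick cross-check that does not require fitting any model, the effect of the cutoff can be sketched with caret's findCorrelation(), which takes a correlation matrix and returns the column indices it suggests removing. This runs on the raw Sonar predictors (no SMOTE, no resampling), so the counts are not expected to match the outputs above exactly; only the trend should be the same.

```r
library(caret)
library(mlbench)  # provides the Sonar data

data(Sonar)
predictors <- Sonar[, names(Sonar) != "Class"]  # the 60 numeric predictors
cor_mat <- cor(predictors)

# Count how many columns findCorrelation() flags at each cutoff
cutoffs <- c(0.6, 0.75, 0.95)
n_removed <- sapply(cutoffs, function(ct) {
  length(findCorrelation(cor_mat, cutoff = ct))
})
names(n_removed) <- cutoffs
print(n_removed)
```

A lower cutoff treats more pairs as "too correlated", so it should flag at least as many columns as a higher one.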
Similarly, you can pass any other preprocessing option through preProcOptions. To check all of them, see:
?caret::preProcess
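For illustration, several options can go into the same list. The sketch below uses arguments documented for preProcess(): thresh (used by "pca"), freqCut and uniqueCut (used by "nzv"), and cutoff (used by "corr"). The specific values and the name ctrl_opts are arbitrary choices for this example, not recommendations.

```r
library(caret)

# Several preProcess() arguments passed through one preProcOptions list
ctrl_opts <- trainControl(method = "cv",
                          number = 5,
                          preProcOptions = list(thresh = 0.90,     # "pca": variance to retain
                                                freqCut = 90 / 10, # "nzv": frequency ratio cutoff
                                                uniqueCut = 10,    # "nzv": percent unique cutoff
                                                cutoff = 0.75))    # "corr": correlation cutoff

# The list is stored on the control object and handed to preProcess() by train()
str(ctrl_opts$preProcOptions)
```

An option only has an effect if the matching method is listed in the preProcess argument of train(); for example, thresh is ignored unless "pca" is requested.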