R (CARET) cannot take a sample larger than the population when 'replace = FALSE'

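I am trying to train a polynomial support vector machine with the caret package in R, and I get the error message stated in the question title. I have had trouble finding a solution through Google and was wondering whether anyone could point me toward one.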

My code is shared below.

Train_CTRL <- trainControl(method = "repeatedcv", 
                           number = 10, repeats = 5)
SVM_Poly <- train(Degree~., 
                  data = Train_Set_Norm, 
                  method = "svmPoly",
                  trControl = Train_CTRL,
                  tuneLength = 1000)

Error:

Error in sample.int(n = 1000000L, size = num_rs * nrow(trainInfo$loop) + : cannot take a sample larger than the population when 'replace = FALSE'

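The error comes from tuneLength rather than from the number of rows in your data. As the call in the message shows, caret draws num_rs * nrow(trainInfo$loop) + ... values without replacement from a fixed population of 1,000,000, so the request grows with the number of resamples times the number of candidate parameter combinations. svmPoly tunes three parameters (degree, scale and C), so the candidate grid appears to grow far faster than tuneLength itself, and with tuneLength = 1000 the request exceeds 1,000,000 and sample.int fails. Since you have not provided your data, I can reproduce your error with the iris data, for example: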

library(caret)

Train_CTRL <- trainControl(method = "repeatedcv", 
                           number = 10, repeats = 5)
SVM_Poly <- train(Species~., 
                  data = iris, 
                  method = "svmPoly",
                  trControl = Train_CTRL,
                  tuneLength = 1000)

Error in sample.int(n = 1000000L, size = num_rs * nrow(trainInfo$loop) + : cannot take a sample larger than the population when 'replace = FALSE'
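The arithmetic behind that message can be sketched in base R, without caret. This is only an illustration under assumptions: the helper names grid_rows, seeds_requested and try_draw are hypothetical, the grid-size formula is my reading of how the default svmPoly grid is built (it is not stated in the post), and the part of the size expression that is cut off after the "+" in the message is ignored.

```r
# Assumption (not confirmed by the original post): the default svmPoly grid
# crosses up to 3 degrees with tune_length values each for scale and C.
grid_rows <- function(tune_length) {
  min(tune_length, 3) * tune_length * tune_length
}

# Size requested from sample.int, following the visible part of the call in
# the error message: num_rs * nrow(trainInfo$loop).
# 10-fold CV repeated 5 times gives 50 resamples.
seeds_requested <- function(tune_length, n_resamples = 10 * 5) {
  n_resamples * grid_rows(tune_length)
}

# Try the same kind of draw: sampling without replacement from 1,000,000.
# Returns "ok" on success, otherwise the error message as a string.
try_draw <- function(tune_length) {
  tryCatch({
    sample.int(n = 1000000L, size = seeds_requested(tune_length))
    "ok"
  }, error = function(e) conditionMessage(e))
}

result_1000 <- try_draw(1000)  # request far exceeds the population
result_10   <- try_draw(10)    # request fits comfortably

print(seeds_requested(1000))
print(seeds_requested(10))
print(result_1000)
print(result_10)
```

Under these assumptions the limit depends only on the resampling scheme and the grid size, which would explain why the error appears on a 150-row data set like iris just as it does on yours.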

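Even counting only one candidate per tuneLength step, tuneLength = 1000 with 10-fold cross-validation repeated 5 times means at least 1000 x 10 x 5 = 50000 model fits, which is computationally very expensive. You can use a smaller number for tuneLength, for example: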

SVM_Poly <- train(Species~., 
                  data = iris, 
                  method = "svmPoly",
                  trControl = Train_CTRL,
                  tuneLength = 10)
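If you want direct control over how many candidates are tried, another option is to pass an explicit grid through train's tuneGrid argument instead of tuneLength. The following is a minimal sketch, not a recommendation: the grid values are hypothetical and chosen only to keep the example small, and the fit is skipped unless caret and kernlab (which svmPoly relies on) are installed.

```r
# Hypothetical grid values for illustration only; choose ranges that suit
# your data. The column names match svmPoly's tuning parameters.
svm_grid <- expand.grid(degree = 1:2,
                        scale  = c(0.01, 0.1),
                        C      = c(0.5, 1))

# Number of candidate combinations that will be evaluated per resample.
print(nrow(svm_grid))

# Only run the fit when the required packages are available.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("kernlab", quietly = TRUE)) {
  library(caret)

  Train_CTRL <- trainControl(method = "repeatedcv",
                             number = 10, repeats = 5)

  SVM_Poly <- train(Species ~ .,
                    data = iris,
                    method = "svmPoly",
                    trControl = Train_CTRL,
                    tuneGrid = svm_grid)

  # Best parameter combination found on the resamples.
  print(SVM_Poly$bestTune)
}
```

With an explicit grid the number of model fits is simply nrow(svm_grid) times the number of resamples, so the cost is easy to predict before you start training.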