Nested Loops for Calculating MSE of Random Forest

Question

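I am trying to calculate the MSE for multiple random forests created by varying the mtry, nodesize and ntree parameters. I pass these parameters as variables to the randomForest function and wrote three nested "for" loops that use them as indices. I want to store the resulting MSE values in a one-dimensional array and compare them. My problem is in the last line of the code, where I try to place the 729 MSE values next to each other in a single array. How can I store them inside a nested loop like the one below?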

set.seed(425)
toyota_idx =sample(1:nrow(ToyotaCorolla),nrow(ToyotaCorolla)*0.7)
toyota_train = ToyotaCorolla[toyota_idx,]
toyota_test=ToyotaCorolla[-toyota_idx,]

##random forest
forest.mse=rep(0,729)

for (i in 1:9){
  for (j in 1:9){
    for (k in 1:9){
      bag.toyota=randomForest(Price~.,data=toyota_train,mtry=i,nodesize=j,ntree=k,importance =TRUE)
      toyota.prediction = predict(bag.toyota ,newdata=toyota_test)
      forest.mse <- c(forest.mse, mean((toyota.prediction-toyota_test$Price)^2))
    }
  }
}

Answer 1

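Keeping track of which array element belongs to which i, j, k combination would be semi-crazy.

Try building a data.frame with your mtry, nodesize, etc., and insert the MSE row by row: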

library(randomForest)

set.seed(425)
# Simulated stand-in for the ToyotaCorolla data: Price plus 10 numeric predictors
ToyotaCorolla = data.frame(Price = runif(100),matrix(rnorm(100*10),ncol=10))

toyota_idx =sample(1:nrow(ToyotaCorolla),nrow(ToyotaCorolla)*0.7)
toyota_train = ToyotaCorolla[toyota_idx,]
toyota_test=ToyotaCorolla[-toyota_idx,]

##random forest
# One row per (mtry, nodesize, ntree) combination: 9 * 9 * 9 = 729 rows
Grid = expand.grid(mtry=1:9,nodesize=1:9,ntree=1:9)
Grid$forest.mse = NA

for(i in 1:nrow(Grid)){
  # Fit a forest with the parameters stored in row i of the grid
  bag.toyota=randomForest(Price~.,data=toyota_train,
                          mtry=Grid$mtry[i],nodesize=Grid$nodesize[i],
                          ntree=Grid$ntree[i],importance =TRUE)
  toyota.prediction = predict(bag.toyota ,newdata=toyota_test)
  # Store the test-set MSE next to the parameters that produced it
  Grid$forest.mse[i] = mean((toyota.prediction-toyota_test$Price)^2)
}

head(Grid)
  mtry nodesize ntree forest.mse
1    1        1     1  0.1431115
2    2        1     1  0.1652446
3    3        1     1  0.2253738
4    4        1     1  0.1352773
5    5        1     1  0.1561385
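
If a single flat vector is still preferred, as the question asks, one option is to preallocate it and compute each position from the loop indices instead of appending with c(). The sketch below is a minimal illustration in base R. fake_mse is a hypothetical stand-in for fitting the forest and computing the test MSE; it is used only so that the ordering can be checked without the randomForest package.

```r
# Hypothetical stand-in for "fit a forest, return its test MSE".
# It just encodes (mtry, nodesize, ntree) so stored positions can be verified.
fake_mse <- function(mtry, nodesize, ntree) mtry * 100 + nodesize * 10 + ntree

n <- 9
flat_mse <- numeric(n^3)   # preallocate all 729 slots up front

for (i in 1:n) {
  for (j in 1:n) {
    for (k in 1:n) {
      # Position of the (i, j, k) combination in the flat vector
      idx <- (i - 1) * n^2 + (j - 1) * n + k
      flat_mse[idx] <- fake_mse(i, j, k)
    }
  }
}

# k varies fastest in the loops, so the vector lines up with a grid
# whose first column is ntree (expand.grid varies its first column fastest).
grid <- expand.grid(ntree = 1:n, nodesize = 1:n, mtry = 1:n)
grid$mse <- flat_mse

# Row with the smallest stored value
best <- grid[which.min(grid$mse), ]
```

With real forests, fake_mse(i, j, k) would be replaced by the mean squared error computed inside the loop. Attaching the vector to a grid built in the matching column order keeps the parameters recoverable for every entry.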
