R giving back error: subscript out of bounds


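I have been working with the College dataset from the R ISLR package. I want to perform best subset selection on the training set and plot the training-set MSE of the best model of each size.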

library(ISLR)
library(leaps)
data(College)
head(College)

# Split the data 70/30: draw 70% of the row indices at random for training
subset <- sample(nrow(College), floor(0.7 * nrow(College)))
collegetrain <- College[subset, ]
collegetest <- College[-subset, ]

Here is my code:

regfit.full <- regsubsets(Apps ~ ., data = collegetrain, nvmax = 20)
train.mat <- model.matrix(Apps ~ ., data = collegetrain, nvmax = 20)
val.errors <- rep(NA, 20)
for (i in 1:20) {
  coefi <- coef(regfit.full, id = i)
  pred <- train.mat[, names(coefi)] %*% coefi
  val.errors[i] <- mean((pred - collegetrain$Apps)^2)
}
plot(val.errors, xlab = "Number of predictors", ylab = "Training MSE", pch = 19, type = "b")

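The dataset is structured as follows: 777 observations, 543 in the training set and 234 in the test set. There are 18 variables, 17 of them numeric and 1 a yes/no factor (this one does not need to be changed).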

The error message when I run my code is: Error in s$which[id, , drop = FALSE] : subscript out of bounds
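The error message points at a row lookup in `s$which`, which appears to be the table of selected variables with one row per model size. The following is a minimal base-R sketch of that failure mode, not the actual leaps source: `which_mat` is a hypothetical stand-in with 17 rows (one per predictor in College) and 18 columns (intercept plus 17 predictors).

```r
# Hypothetical stand-in for summary(regfit.full)$which: one row per model size.
which_mat <- matrix(TRUE, nrow = 17, ncol = 18)

# Row 17 exists, so this lookup succeeds and returns a 1 x 18 matrix.
ok <- which_mat[17, , drop = FALSE]

# Row 18 does not exist; capture the error message instead of stopping.
msg <- tryCatch(
  which_mat[18, , drop = FALSE],
  error = function(e) conditionMessage(e)
)

print(dim(ok))
print(msg)
```

Asking for model 18, 19, or 20 in the loop is the same kind of lookup, which would explain why the loop fails once `i` exceeds the number of predictors.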

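# College has 17 predictors for Apps, so regsubsets() can return at most
# 17 models. Calling coef() with id = 18 indexes past the last row of
# summary(regfit.full)$which, which raises "subscript out of bounds".
regfit.full <- regsubsets(Apps ~ ., data = collegetrain, nvmax = 17)
# nvmax is not a model.matrix() argument, so it is dropped here
train.mat <- model.matrix(Apps ~ ., data = collegetrain)

val.errors <- rep(NA, 17)
for (i in 1:17) {
  coefi <- coef(regfit.full, id = i)
  pred <- train.mat[, names(coefi)] %*% coefi
  val.errors[i] <- mean((pred - collegetrain$Apps)^2)
}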

plot(val.errors, xlab = "Number of predictors", ylab = "Training MSE", 
     pch = 19, type = "b")
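One way to avoid hard-coding the loop bound is to read the number of fitted models from the `regsubsets` object itself. The sketch below assumes the leaps package is installed and uses the built-in `mtcars` data (10 predictors for `mpg`) purely as an illustration; the names `fit`, `n_models`, `x`, and `train_mse` are not from the original post.

```r
library(leaps)

# mtcars has 10 predictors for mpg, so at most 10 model sizes exist
# even though nvmax asks for 20.
fit <- regsubsets(mpg ~ ., data = mtcars, nvmax = 20)

# Number of models actually fitted: one row of `which` per model size.
n_models <- nrow(summary(fit)$which)

x <- model.matrix(mpg ~ ., data = mtcars)
train_mse <- numeric(n_models)
for (i in seq_len(n_models)) {
  coefi <- coef(fit, id = i)
  # drop = FALSE keeps x a matrix even if only one column is selected
  pred <- x[, names(coefi), drop = FALSE] %*% coefi
  train_mse[i] <- mean((mtcars$mpg - pred)^2)
}

plot(train_mse, xlab = "Number of predictors", ylab = "Training MSE",
     pch = 19, type = "b")
```

With this pattern the same loop would run unchanged on `collegetrain`, since `n_models` follows the data rather than the `nvmax` request.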