How to fix the error "Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) : undefined columns selected" in the caret package

I get this error:

Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) : undefined columns selected

when I use caret to fit a bootstrapped regression separately for each of 3 different clusters (the index column):

library("tidyverse")
library("lattice")

library("caret")
library("janitor")
data<- read.csv("C:/Users/asus/Desktop/test.csv",header = TRUE)
mydata <- data.frame(index = data$cluster,
                     x     = data[,3:4],
                     y     = data[,5])
tab <- table(mydata$index)
tab
sample_n(mydata, 3)
attach(mydata)
mylist <- list()
mydata <- clean_names(mydata)
head(mydata)
for (i in 1:length(unique(mydata$index))) {
  # define training control
  train.ctrl <- trainControl(method = "boot", number = tab[i])
  # train the model
  mylist[[i]] <- train(mydata[index == i,"y"] ~ mydata[index ==i,"x_xa"] + mydata[index == i,"x_xb"], data = data.frame(mydata), method = "lm",
                       trControl = train.ctrl)
  print(mylist[[i]])
  summary(mylist[[i]])
}

Here you can see my data:

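You can subset the data in each iteration and apply the same formula, because every subset has the same columns. Try going through the help page again. Also, please include only the relevant code.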
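To see why the original call fails, it may help to look at what `all.vars()` extracts from the two kinds of formula. Judging from the call shown in the error message, `train()` selects `data[, all.vars(Terms), drop = FALSE]`, so every name in the formula has to be a column of `data`. The sketch below uses only base R and the column names from the question (`y`, `x_xa`, `x_xb`); a formula is not evaluated when it is created, so no data is needed to run it.

```r
# Formula as written in the question: each term is a subsetting expression
bad_formula <- mydata[index == i, "y"] ~
  mydata[index == i, "x_xa"] + mydata[index == i, "x_xb"]

# Formula that refers to the columns by name
good_formula <- y ~ x_xa + x_xb

# all.vars() returns the symbols used in the formula, not the quoted strings
all.vars(bad_formula)
# [1] "mydata" "index"  "i"

all.vars(good_formula)
# [1] "y"    "x_xa" "x_xb"
```

Since `mydata` and `i` are not columns of the data frame, selecting them raises "undefined columns selected". Writing the formula with plain column names and doing the subsetting in the `data` argument avoids this.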

Suppose your data looks like this:

set.seed(111)
mydata = data.frame(index = sample(1:3,500,replace=TRUE),
                    x1 = rnorm(500),
                    x2 = rnorm(500),
                     y = runif(500)
                    )

Then something like this should work:

library(caret)

tab <- table(mydata$index)
mylist <- list()

for (i in unique(mydata$index)) {

  train.ctrl <- trainControl(method = "boot", number = tab[i])
  mylist[[i]] <- train(y ~ x1 + x2, 
                       data = subset(mydata, index == i),
                       method = "lm",
                       trControl = train.ctrl)
}
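
If the cluster labels are not guaranteed to be the integers 1, 2, 3, one possible variant is to split the data first and fit one model per piece, so the models are stored by cluster name rather than by position. This is only a sketch under the same simulated data as above; `by_cluster`, `models` and `fit_cluster` are illustrative names, and using `nrow(d)` for `number` simply mirrors the `tab[i]` choice in the loop.

```r
library(caret)

set.seed(111)
mydata <- data.frame(index = sample(1:3, 500, replace = TRUE),
                     x1 = rnorm(500),
                     x2 = rnorm(500),
                     y  = runif(500))

# split() returns a named list holding one data frame per cluster
by_cluster <- split(mydata, mydata$index)

# Fit a bootstrapped linear model on a single cluster's data frame
fit_cluster <- function(d) {
  ctrl <- trainControl(method = "boot", number = nrow(d))
  train(y ~ x1 + x2, data = d, method = "lm", trControl = ctrl)
}

models <- lapply(by_cluster, fit_cluster)

# Resampling results and the underlying lm fit for cluster "1"
print(models[["1"]]$results)
print(summary(models[["1"]]$finalModel))
```

The explicit `print()` calls matter if this is moved inside a `for` loop, where a bare `summary(...)` does not auto-print.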