R: "not a valid R variable name" error with caret's train function

I want to use caret's train function to explore xgboost results:

library(caret)  # provides train(), trainControl() and twoClassSummary

# open the file with the training data
trainy <- read.csv('')
# open file with test data
test <- read.csv('')

##### Removing IDs (the ID column is not needed for training)
trainy$ID <- NULL
test.id <- test$ID
test$ID <- NULL

##### Extracting TARGET
trainy.y <- trainy$TARGET

trainy$TARGET <- NULL


# set up the cross-validated hyper-parameter search
xgb_grid_1 = expand.grid(
  nrounds = 1000,
  eta = c(0.01, 0.001, 0.0001),
  max_depth = c(2, 4, 6, 8, 10),
  gamma = 1
)

# pack the training control parameters
xgb_trcontrol_1 = trainControl(
  method = "cv",
  number = 5,
  verboseIter = TRUE,
  returnData = FALSE,
  returnResamp = "all",      # save losses across all models
  classProbs = TRUE,         # set to TRUE for AUC to be computed
  summaryFunction = twoClassSummary,
  allowParallel = TRUE
)

# train the model for each parameter combination in the grid, 
#   using CV to evaluate
xgb_train_1 = train(
  x = as.matrix(trainy),
  y = as.factor(trainy.y),
  trControl = xgb_trcontrol_1,
  tuneGrid = xgb_grid_1,
  method = "xgbTree"
)

I get this error:

Error in train.default(x = as.matrix(trainy), y = as.factor(trainy.y), trControl = xgb_trcontrol_1,  : 
  At least one of the class levels is not a valid R variable name;

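I have looked at other cases, but I still don't understand what I should change. Coming from Python, R feels completely different to me right now.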

As far as I can tell, I need to do something with the y class variable, but what exactly, and how? Why doesn't as.factor work here?

I solved the problem myself; I hope this helps other beginners.

I needed to convert all the data to factor type, like this:

trainy[] <- lapply(trainy, factor)
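
For anyone else who runs into this: the error message is about the levels of the outcome factor (y), not the predictors. With classProbs = TRUE, caret uses the class levels as column names for the predicted probabilities, so levels such as "0" and "1" are rejected. Below is a minimal base-R sketch of relabelling the outcome. It assumes TARGET is coded 0/1; the `target` vector and the labels "no"/"yes" are made up for illustration and are not from the original data.

```r
# Hypothetical 0/1 outcome standing in for trainy$TARGET (assumption)
target <- c(0, 1, 0, 0, 1)

# as.factor() keeps the raw values as level names: "0" and "1"
y_raw <- as.factor(target)
levels(y_raw)

# A level is a valid R variable name only if make.names() leaves it unchanged;
# here make.names() would turn "0" and "1" into "X0" and "X1"
levels(y_raw) == make.names(levels(y_raw))

# Relabel the classes with syntactically valid names (labels chosen arbitrarily)
y_ok <- factor(target, levels = c(0, 1), labels = c("no", "yes"))
levels(y_ok)
all(levels(y_ok) == make.names(levels(y_ok)))
```

A factor built like `y_ok` could then be passed as the `y` argument of train() in place of as.factor(trainy.y). Whether "no"/"yes" are the right labels depends on what TARGET actually encodes.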