How to use cross-validation to determine the cost in a multi-class SVM

Question

Hi, I am doing multi-class classification in R with the approach below. How can I use cross-validation to choose the best cost?

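  library(e1071)  # provides svm()

  # Hold out 25% of the rows for testing
  n <- nrow(hd)
  ntrain <- round(n * 0.75)
  set.seed(314)
  tindex <- sample(n, ntrain)
  train_iris <- hd[tindex, ]
  test_iris <- hd[-tindex, ]

  # RBF-kernel SVM with a fixed gamma and cost
  svm1 <- svm(res ~ ., data = train_iris,
              type = "C-classification", kernel = "radial",
              gamma = 0.1, cost = 10)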

Answer 1

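e1071 provides functions for tuning hyperparameter values to get better performance from a model. The example below shows how to use the tune function in e1071. It searches the given range for the best cost parameter (here cost = 1 gives the lowest cross-validated error rate, 0.0333).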

library(e1071)

hd <- iris
# 10-fold cross-validation (the tune() default) over a grid of cost values
svm_tune <- tune(svm, Species ~ ., data = hd, kernel = "radial",
                 ranges = list(cost = c(0.001, 0.01, 0.1, 1, 10, 100)))

summary(svm_tune)

# Parameter tuning of ‘svm’:
#   
#   - sampling method: 10-fold cross validation 
# 
# - best parameters:
#   cost
# 1
# 
# - best performance: 0.03333333 
# 
# - Detailed performance results:
#   cost      error dispersion
# 1 1e-03 0.74000000 0.15539674
# 2 1e-02 0.74000000 0.15539674
# 3 1e-01 0.10666667 0.07166451
# 4 1e+00 0.03333333 0.04714045
# 5 1e+01 0.04000000 0.05621827
# 6 1e+02 0.05333333 0.06126244
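
To connect this back to the train/test split in the question, here is a minimal sketch of one way to use the tuning result. It assumes iris stands in for the asker's data (as above), tunes on the training rows only, and also searches over gamma. The names train_set, test_set, and test_error, the gamma grid, and the choice of 5 folds are illustrative assumptions, not part of the original answer.

```r
library(e1071)

# Assumption: iris stands in for the asker's data, as in the answer above.
hd <- iris

# Reproduce the 75/25 split from the question.
set.seed(314)
n <- nrow(hd)
tindex <- sample(n, round(n * 0.75))
train_set <- hd[tindex, ]
test_set <- hd[-tindex, ]

# Tune cost and gamma on the training rows only, using 5-fold
# cross-validation instead of the default 10-fold.
svm_tune <- tune(svm, Species ~ ., data = train_set, kernel = "radial",
                 ranges = list(cost = 10^(-3:2), gamma = c(0.01, 0.1, 1)),
                 tunecontrol = tune.control(sampling = "cross", cross = 5))

# Selected hyperparameters and their cross-validated error.
print(svm_tune$best.parameters)
print(svm_tune$best.performance)

# best.model is the SVM refit on all of train_set with the selected values.
pred <- predict(svm_tune$best.model, newdata = test_set)
test_error <- mean(pred != test_set$Species)
print(table(predicted = pred, actual = test_set$Species))
print(test_error)
```

Tuning on the training rows and scoring once on the held-out rows keeps the test error from being influenced by the parameter search. The exact values chosen depend on the random fold assignment, so they may differ from run to run unless the seed is fixed.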

