Supervised classification: plotting K-NN accuracy for different sample sizes and k values
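I hope you understand that it is hard to reproduce something like this on a generic data set.
Basically, what I am trying to do is run K-NN with seven different values of k for two different train/test set sizes.
My main problem is that res should be a vector storing all the accuracy values for the same training-set size, but it only holds one value per iteration, so I cannot plot the accuracy graphs: they come out empty.
Do you know how to fix this?
The data set is freely available in R.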
library(mlbench) # provides the Sonar data set
library(class)   # provides knn()
data("Sonar")
#Randomization of the sample
set.seed(123)
random <- sample(rep(1:dim(Sonar)[1]))
Sonar <- Sonar[random,]
head(Sonar)
for (i in c(50,100)){ #train/test set size
  sonar.train <- Sonar[1:i,-61]
  sonar.train.label <- Sonar[1:i,61]
  sonar.test <- Sonar[(1+i):208,-61]
  sonar.test.label <- Sonar[(1+i):208,61]
  res <- rep(NA,7)
  for (j in c(3,5,7,9,11,13,15)){ #values of k
    mod = knn(train = sonar.train, test = sonar.test, cl = sonar.train.label, k = j) #classification for test set
    err = sum(sonar.test.label==mod) #number of correct predictions
    res[match(j,c(3,5,7,9,11,13,15))] = err/length(mod) #put accuracy value in vector
    print(res)
    plot(x = c(3,5,7,9,11,13,15), y = res, type = "l", col = "blue", xlab = "Neighbours", ylab = "Accuracy") #plot the accuracy graphs for each of the two different train/test sets
    res <- rep(NA,7)
  }
}
#output
>
0.6835443 NA NA NA NA NA NA
NA 0.6582278 NA NA NA NA NA
NA NA 0.6075949 NA NA NA NA
NA NA NA 0.6265823 NA NA NA
NA NA NA NA 0.5949367 NA NA
NA NA NA NA NA 0.5949367 NA
NA NA NA NA NA NA 0.5506329
0.6759259 NA NA NA NA NA NA
NA 0.6111111 NA NA NA NA NA
NA NA 0.5648148 NA NA NA NA
NA NA NA 0.5833333 NA NA NA
NA NA NA NA 0.5925926 NA NA
NA NA NA NA NA 0.5740741 NA
NA NA NA NA NA NA 0.5740741
The accuracy plots come out empty, and the k values on the x-axis have different labels.
Thanks for reading and helping me!
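Your inner loop is supposed to fill in the values of res, one per iteration. However, you reset res at the end of every iteration of that loop, which is why it does not keep any of the previous values.
These two lines need to be outside the inner loop (and inside the outer loop):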
plot(x = c(3,5,7,9,11,13,15) ,y = res, type = "l" ,col = "blue", xlab = "Neighbours", ylab = "Accuracy") #plot the accuracy graphs for each of the two different train/test sets
res <- rep(NA,7)
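The plot call and the re-initialization of res should be outside the inner loop; otherwise you reset res to a vector of NAs on every inner-loop iteration.
The new for loop should look like this: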
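To see the effect in isolation, here is a minimal sketch in base R. The stored values j / 10 are made-up stand-ins for the accuracy, and the names ks, res_bad and res_good are only illustrative; none of this comes from the original code.

```r
# Toy stand-in for the accuracy computation (hypothetical values, not K-NN results)
ks <- c(3, 5, 7)

# Reset inside the loop: each stored value is wiped before the next iteration
res_bad <- rep(NA, length(ks))
for (j in ks) {
  res_bad[match(j, ks)] <- j / 10
  res_bad <- rep(NA, length(ks)) # discards the value that was just stored
}

# No reset inside the loop: the values accumulate
res_good <- rep(NA, length(ks))
for (j in ks) {
  res_good[match(j, ks)] <- j / 10
}

print(res_bad)  # all NA
print(res_good) # one value per k
```

With the reset inside the loop, nothing survives to the end, so a plot made from that vector has no points to draw.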
for (i in c(50,100)){ #train/test set size
  sonar.train <- Sonar[1:i,-61]
  sonar.train.label <- Sonar[1:i,61]
  sonar.test <- Sonar[(1+i):208,-61]
  sonar.test.label <- Sonar[(1+i):208,61]
  res <- rep(NA,7)
  for (j in c(3,5,7,9,11,13,15)){ #values of k
    mod = knn(train = sonar.train, test = sonar.test, cl = sonar.train.label, k = j) #classification for test set
    err = sum(sonar.test.label==mod) #number of correct predictions
    res[match(j,c(3,5,7,9,11,13,15))] = err/length(mod) #put accuracy value in vector
  }
  plot(x = c(3,5,7,9,11,13,15), y = res, type = "l", col = "blue", xlab = "Neighbours", ylab = "Accuracy", main = paste("i =", i)) #plot the accuracy graphs for each of the two different train/test sets
  res <- rep(NA,7)
}
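By the way, I added main = paste("i =", i) to the plot call to identify which iteration of the outer loop each plot refers to.
Edit
I only realized after writing my answer that @Aziz beat me to it by a few seconds :D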
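If you would rather avoid the index bookkeeping altogether, one possible alternative (my own sketch, not something from the question) is to let sapply build the accuracy values, so there is no res vector to reset by accident. It assumes the mlbench and class packages are installed; ks, sizes and acc are names I made up, and drawing both curves on a single plot with matplot is a design choice, not a requirement.

```r
library(mlbench) # provides the Sonar data set
library(class)   # provides knn()

data("Sonar")
set.seed(123)
Sonar <- Sonar[sample(nrow(Sonar)), ] # shuffle the rows

ks <- c(3, 5, 7, 9, 11, 13, 15) # values of k
sizes <- c(50, 100)             # training-set sizes

# Result: a matrix with one row per k and one column per training-set size
acc <- sapply(sizes, function(n) {
  train <- Sonar[1:n, -61]
  test <- Sonar[(n + 1):nrow(Sonar), -61]
  train_lab <- Sonar[1:n, 61]
  test_lab <- Sonar[(n + 1):nrow(Sonar), 61]
  sapply(ks, function(k) {
    pred <- knn(train = train, test = test, cl = train_lab, k = k)
    mean(pred == test_lab) # fraction of correct predictions
  })
})

# One line per training-set size on a shared set of axes
matplot(ks, acc, type = "l", lty = 1, col = c("blue", "red"),
        xlab = "Neighbours", ylab = "Accuracy")
legend("topright", legend = paste("n =", sizes), col = c("blue", "red"), lty = 1)
```

Putting both curves on the same axes also makes the two training-set sizes easier to compare than two separate plots.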