How can I improve this confusion matrix in R?
Using the iris dataset in R, I wrote a function to plot a confusion matrix.
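library(e1071)    # naiveBayes()
library(caTools)  # sample.split()
library(caret)    # confusionMatrix()
# sample.split() expects the vector of class labels, not the whole data frame.
# Keeping the split in its own vector also stops it leaking into `Species ~ .`.
spl <- sample.split(iris$Species, SplitRatio = 0.1)
train <- iris[spl, ]
test <- iris[!spl, ]
iris.nb <- naiveBayes(Species ~ ., data = train)
nb_train_predict <- predict(iris.nb, test[ , names(test) != "Species"])
cfm <- confusionMatrix(nb_train_predict, test$Species)
cfm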
#ggplot confusion matrix
library(ggplot2)
library(scales)
ggplotConfusionMatrix <- function(m){
  mytitle <- paste("Accuracy", percent_format()(m$overall[1]),
                   "Kappa", percent_format()(m$overall[2]))
  p <-
    ggplot(data = as.data.frame(m$table),
           aes(x = Reference, y = Prediction)) +
    geom_tile(aes(fill = log(Freq)), colour = "white") +
    scale_fill_gradient(low = "white", high = "steelblue") +
    geom_text(aes(x = Reference, y = Prediction, label = Freq)) +
    theme(legend.position = "none") +
    ggtitle(mytitle)
  return(p)
}
ggplotConfusionMatrix(cfm)
My question is: how do I remove the "0"s from the matrix? (I want grey squares without any text.)
You can create a separate column for the labels. For cells with a frequency of 0, set the label to an empty string.
library(ggplot2)
library(scales)
ggplotConfusionMatrix <- function(m){
  mytitle <- paste("Accuracy", percent_format()(m$overall[1]),
                   "Kappa", percent_format()(m$overall[2]))
  dat <- as.data.frame(m$table)
  # Label column: empty string where the count is 0, the count itself otherwise
  dat$lab <- ifelse(dat$Freq == 0, '', dat$Freq)
  p <-
    ggplot(data = dat,
           aes(x = Reference, y = Prediction)) +
    geom_tile(aes(fill = log(Freq)), colour = "white") +
    scale_fill_gradient(low = "white", high = "steelblue") +
    geom_text(aes(x = Reference, y = Prediction, label = lab)) +
    theme(legend.position = "none") +
    ggtitle(mytitle)
  return(p)
}
ggplotConfusionMatrix(cfm)
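If you want to check the labelling without training a model first, the same idea can be tried on a small hand-made table. This is only a sketch: the counts in `tab` are made up for illustration, and `label_nonzero` is a hypothetical helper name, not something from caret or ggplot2.

```r
library(ggplot2)

# Hypothetical helper: blank label for zero counts, the count as text otherwise
label_nonzero <- function(freq) ifelse(freq == 0, "", as.character(freq))

# Made-up 3x3 confusion table (rows = Prediction, columns = Reference)
classes <- c("setosa", "versicolor", "virginica")
tab <- as.table(matrix(c(45,  0,  0,
                          0, 40,  3,
                          0,  5, 42),
                       nrow = 3, byrow = TRUE,
                       dimnames = list(Prediction = classes,
                                       Reference = classes)))

dat <- as.data.frame(tab)            # columns: Prediction, Reference, Freq
dat$lab <- label_nonzero(dat$Freq)   # "" for the four zero cells

p <- ggplot(dat, aes(x = Reference, y = Prediction)) +
  # log(0) is -Inf, which falls outside the gradient; such tiles should
  # come out in the scale's na.value colour (grey by default)
  geom_tile(aes(fill = log(Freq)), colour = "white") +
  scale_fill_gradient(low = "white", high = "steelblue") +
  geom_text(aes(label = lab)) +
  theme(legend.position = "none")
p
```

Since the label lives in its own column, the fill still uses `Freq`, so the zero cells stay grey and only their text disappears.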