Compare 3 model metrics with ggplot2

Question

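When plotting 3 model metrics (RMSE, MAE, Rsquared) from 3 trained models against the corresponding test-set metrics, I am trying to show that the neural network model is the best:

The distance between its training and test metrics is the smallest
It also has sufficiently low RMSE/MAE and a high Rsquared

This is not very obvious from the attached plot as it stands. Also, the metrics are on different scales, since Rsquared lies in the [0,1] interval. Is there a way to plot this better, preferably on a single plot?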

> trn
            model RMSE Rsquared  MAE dataType
1      Linear Reg 9.17     0.51 6.03    train
2      SVM Radial 7.86     0.64 4.86    train
3 Neural Networks 8.55     0.57 5.59    train
> tst
            model RMSE Rsquared  MAE dataType
1      Linear Reg 9.40     0.53 5.95     test
2      SVM Radial 9.16     0.55 5.50     test
3 Neural Networks 8.66     0.60 5.48     test
>

Reproducible code:

library(tidyr)    # pivot_longer() and the %>% pipe
library(ggplot2)  # ggplot(), geom_point()

trn <- structure(list(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                      RMSE = c(9.17, 7.86, 8.55), Rsquared = c(0.51, 0.64, 0.57),
                      MAE = c(6.03, 4.86, 5.59)),
                 row.names = c(NA, -3L), class = "data.frame")

tst <- structure(list(model = c("Linear Reg", "SVM Radial", "Neural Networks"),
                      RMSE = c(9.4, 9.16, 8.66), Rsquared = c(0.53, 0.55, 0.6),
                      MAE = c(5.95, 5.5, 5.48)),
                 row.names = c(NA, -3L), class = "data.frame")

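# Tag each table with the data set it was computed on
trn$dataType <- "train"
tst$dataType <- "test"

# Stack both tables and reshape to one row per model / dataType / metric
long_tbl <- rbind(trn, tst) %>%
  pivot_longer(cols = !c("model", "dataType"), names_to = "metric", values_to = "value")

ggplot(long_tbl, aes(x = model, y = value, shape = dataType, colour = metric)) +
  geom_point()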

Answer 1

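The simplest approach is + facet_wrap(~metric, scales="free"), but I think that doesn't meet your "in one plot" requirement (it is a single plot statement, but three subplots). If gg1 is your original plot, then this is a very compressed format: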

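print(gg1
    + facet_wrap(~metric, scales = "free_y", ncol = 1)  # one panel per metric, own y-axis each
    + theme_bw()
    + theme(panel.spacing = grid::unit(0, "lines"),     # no gap between panels
            strip.background = element_blank(),
            strip.text.x = element_blank())             # hide the strip labels
)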

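Anything more compressed than this would require you to decide which information to discard (for example, are you willing to rescale all metrics to min=0, max=1, or does the magnitude of the differences convey information?).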
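If rescaling is acceptable, the min=0, max=1 idea could look roughly like the sketch below. This is only one possible reading of the suggestion, not the only way to do it: the helper rescale01, the column scaled, and the object gg_scaled are names made up for this example, and the data frame simply re-enters the values from the question in long format so the block runs on its own.

```r
library(ggplot2)

# Values copied from trn/tst in the question, already in long format.
long_tbl <- data.frame(
  model    = rep(c("Linear Reg", "SVM Radial", "Neural Networks"), times = 6),
  dataType = rep(rep(c("train", "test"), each = 3), times = 3),
  metric   = rep(c("RMSE", "Rsquared", "MAE"), each = 6),
  value    = c(9.17, 7.86, 8.55, 9.40, 9.16, 8.66,   # RMSE: train, then test
               0.51, 0.64, 0.57, 0.53, 0.55, 0.60,   # Rsquared
               6.03, 4.86, 5.59, 5.95, 5.50, 5.48)   # MAE
)

# Hypothetical helper: map a numeric vector linearly onto [0, 1].
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Rescale within each metric, pooling train and test values.
long_tbl$scaled <- ave(long_tbl$value, long_tbl$metric, FUN = rescale01)

# All three metrics now share one y-axis; dodging by metric keeps the
# train and test points of the same metric vertically aligned.
gg_scaled <- ggplot(long_tbl,
                    aes(x = model, y = scaled, shape = dataType,
                        colour = metric, group = metric)) +
  geom_point(size = 3, position = position_dodge(width = 0.5)) +
  labs(y = "value rescaled to [0, 1] within each metric") +
  theme_bw()

print(gg_scaled)
```

The cost is the one mentioned above: the absolute size of each train/test difference is no longer readable from the axis. Note also that a higher rescaled value is better for Rsquared but worse for RMSE and MAE.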

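Keeping the strip labels intact might make the plot easier to read (the reader doesn't have to squint at the legend to figure out which metric is which); you could also try moving the strip labels to the right edge.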
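A minimal sketch of the right-edge variant, assuming gg1 is the plot from the question. The data are re-entered here only so the block is self-contained, and gg_strips is a name chosen for this example. It relies on the strip.position argument of facet_wrap().

```r
library(ggplot2)

# Values copied from trn/tst in the question, already in long format.
long_tbl <- data.frame(
  model    = rep(c("Linear Reg", "SVM Radial", "Neural Networks"), times = 6),
  dataType = rep(rep(c("train", "test"), each = 3), times = 3),
  metric   = rep(c("RMSE", "Rsquared", "MAE"), each = 6),
  value    = c(9.17, 7.86, 8.55, 9.40, 9.16, 8.66,   # RMSE: train, then test
               0.51, 0.64, 0.57, 0.53, 0.55, 0.60,   # Rsquared
               6.03, 4.86, 5.59, 5.95, 5.50, 5.48)   # MAE
)

# The original plot from the question.
gg1 <- ggplot(long_tbl, aes(x = model, y = value,
                            shape = dataType, colour = metric)) +
  geom_point()

# Keep the strip labels, but put them on the right edge of each panel.
gg_strips <- gg1 +
  facet_wrap(~metric, scales = "free_y", ncol = 1, strip.position = "right") +
  theme_bw() +
  theme(panel.spacing = grid::unit(0, "lines")) +
  guides(colour = "none")  # the strips already name the metric

print(gg_strips)
```

Dropping the colour legend is a judgment call: the strips already identify each metric, so only the shape legend for dataType is left.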

