Shapes in my legend appear in a different order than in the plot (ggplot2)

I have a data frame in R containing PCA data that looks roughly like this:

obsnames PC1 PC2 PC3
one 2.46 2.57 1.366962e-15
two -3.47 0.84 3.053113e-16
three 1.01 -3.40 7.077672e-16

You can load the exact variable with this:

pca_data <- structure(list(obsnames = c("one", "two", "three"), PC1 = c(2.46310908247957, 
-3.46877162330214, 1.00566254082257), PC2 = c(2.56831624877025, 
0.836571395923965, -3.40488764469422), PC3 = c(1.36696209906972e-15, 
3.05311331771918e-16, 7.07767178198537e-16), `Sample Size` = c(48L, 
74L, 52L)), row.names = c("one", "two", "three"), class = "data.frame")

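Now, I am trying to plot this PCA with ggplot2's geom_point, using only the shapes that support the "fill" aesthetic (21-25, IIRC). However, I am having trouble getting the legend to match both the shapes and the colors shown in the plot. I gave up trying to figure it out on my own, and I find it odd, since I am supplying almost everything manually. Here is my plotting code: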

len <- length(pca_data$obsnames)
ggplot(pca_data, aes_string(x=x, y=y)) + 
  geom_point(shape = rep_len(c(21, 22, 23, 24, 25), length.out = len),
             color = "black", size = 3, aes(fill=obsnames)) + 
  theme_bw() + 
  theme(legend.position="right") + 
  xlab(label_x) + 
  ylab(label_y) + 
  ggtitle(main) + 
  theme(plot.title = element_text(hjust = 0, face="bold")) + 
  geom_hline(aes(0), size=.2,yintercept=0) + 
  geom_vline(aes(0), size=.2,xintercept=0) + 
  coord_equal() + 
  geom_text(data=datapc, aes(x=v1, y=v2, label=varnames), size = 3, vjust=0.3, color="grey", fontface="bold") + 
  geom_segment(data=datapc, aes(x=0, y=0, xend=v1, yend=v2), color="grey", linetype="dotted") + 
  scale_fill_manual(values = rep_len(c("red", "blue", "green", "orange", "yellow", "purple", "pink", "light blue", "white", "black", "gold"), length.out = len)) + 
  guides(fill=guide_legend(override.aes=list(shape=rep_len(c(21, 22, 23, 24, 25), length.out = len))))

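The output looks like this:

As you can see, the legend shows "two" as a green diamond, when it should actually be a green square. Also, when the number of points (obsnames) happens to equal the number of shapes in my shape vector, c(21, 22, 23, 24, 25), that is, 5, the problem does not appear. But I really don't understand what I am doing wrong...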

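This is one of those things that works better if you let ggplot handle it; that is, make sure you put the shape/fill specification inside aes(). I have pared down the features of your plot a bit for this demonstration; adding them back should not be too hard. More importantly, note that I created a named vector to pass as the values argument of scale_*_manual(); this ensures that the values and labels match up the correct way: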

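library(ggplot2)

# Recycle the shape and color pools to one value per observation
len <- length(pca_data$obsnames)
shapes <- rep_len(x = c(21, 22, 23, 24, 25), length.out = len)
ptcols <- rep_len(x = c(
    "red", "blue", "green", "orange", "yellow", "purple", "pink",
    "light blue", "white", "black", "gold"
), length.out = len)
# Name both vectors by observation so the scales match values to labels by name
names(shapes) <- pca_data$obsnames
names(ptcols) <- pca_data$obsnames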
ggplot(data = pca_data, mapping = aes(x = PC1, y = PC2)) +
    geom_point(aes(shape = obsnames, fill = obsnames), color = "black") +
    scale_fill_manual(values  = ptcols) +
    scale_shape_manual(values = shapes) +
    theme_bw()
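
To see why the original legend went wrong, here is a minimal base R sketch of the ordering involved. It assumes that a discrete scale orders character values the way factor() does (sorted alphabetically), and that override.aes hands its shape vector to the legend keys by position; the variable names below are made up for illustration and are not part of ggplot2.

```r
# Observation names in data (row) order, as in the question's data frame
obsnames   <- c("one", "two", "three")
shape_pool <- c(21, 22, 23, 24, 25)

# Shapes as drawn in the plot: assigned positionally, one per row
plot_shapes <- setNames(rep_len(shape_pool, length(obsnames)), obsnames)

# Legend keys: assumed to be sorted like factor levels (one, three, two)
legend_keys <- levels(factor(obsnames))

# override.aes style: the same shape vector, applied to the sorted keys by position
legend_shapes <- setNames(rep_len(shape_pool, length(legend_keys)), legend_keys)

print(plot_shapes)                        # shapes used for the points
print(legend_shapes[names(plot_shapes)])  # shapes the legend would show

# Looking shapes up by name instead of by position removes the mismatch
fixed_legend_shapes <- plot_shapes[legend_keys]
print(fixed_legend_shapes)
```

In this trace, "two" is drawn with shape 22 (a square) but receives shape 23 (a diamond) in the positional legend, which matches the symptom in the question. A named vector sidesteps this because the lookup is by name, regardless of the order of the keys.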