How to prevent R from alphabetically ranking data in ggplot and specify the order in which data is plotted (Data + Code + Graphs provided)?

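I am trying to resolve an issue with how R handles the axis labels in my ggballoonplot chart.

By default, R plots the data with the labels in reverse alphabetical order, but revealing the pattern in the data requires plotting it in a specific order. The only way I have found to trick the software is to manually add a prefix to every label in my .csv table so that R ranks them correctly in the output. This is time-consuming, because I have to sort the data by hand, add the prefixes, and only then plot.

I would like to supply a character vector (or something similar) that specifies the order in which the data should be plotted, so that the pattern shows up without adding prefixes to the label names.

I made a few attempts with "scale_y_discrete", without success. I would also like to do the same for the x axis, because I have to use the same "trick" to show the columns in the correct non-alphabetical order, and that shifts the position of the labels. Any ideas on how to get ggplot to display the values in the order I want without having to "trick" the software? The workaround is very time-consuming.

Data + Code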

#Assign data to "Stack_Overflow_DummyData"

Stack_Overflow_DummyData <- structure(list(Species = structure(c(8L, 3L, 1L, 5L, 6L, 2L, 
                                     7L, 4L, 8L, 3L, 1L, 5L, 6L, 2L, 7L, 4L, 8L, 3L, 1L, 5L, 6L, 2L, 
                                     7L, 4L, 8L, 3L, 1L, 5L, 6L, 2L, 7L, 4L), .Label = c("Ani", "Cal", 
                                                                                         "Can", "Cau", "Fis", "Ort", "Sem", "Zan"), class = "factor"), 
               Species_prefix = structure(c(8L, 7L, 6L, 5L, 4L, 3L, 2L, 
                                            1L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L, 8L, 7L, 6L, 5L, 4L, 3L, 
                                            2L, 1L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L), .Label = c("ac.Cau", 
                                                                                                "ad.Sem", "af.Cal", "ag.Ort", "as.Fis", "at.Ani", "be.Can", 
                                                                                                "bf.Zan"), class = "factor"), Dist = structure(c(2L, 3L, 
                                                                                                                                                 5L, 2L, 1L, 1L, 4L, 5L, 2L, 3L, 5L, 2L, 1L, 1L, 4L, 5L, 2L, 
                                                                                                                                                 3L, 5L, 2L, 1L, 1L, 4L, 5L, 2L, 3L, 5L, 2L, 1L, 1L, 4L, 5L
                                                                                                ), .Label = c("End", "Ind", "Pan", "Per", "Wid"), class = "factor"), 
               Region = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 
                                    4L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                                    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Cen", "Col", 
                                                                                "Far", "Nor"), class = "factor"), Region_prefix = structure(c(1L, 
                                                                                                                                              1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
                                                                                                                                              3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
                                                                                                                                              4L), .Label = c("a.Far", "b.Nor", "c.Cen", "d.Col"), class = "factor"), 
               Frequency = c(75, 50, 25, 50, 0, 0, 0, 0, 11.1, 22.2, 55.6, 
                             55.6, 11.1, 0, 5.6, 0, 0, 2.7, 36.9, 27.9, 65.8, 54.1, 37.8, 
                             28.8, 0, 0, 0, 3.1, 34.4, 21.9, 78.1, 81.3)), class = "data.frame", row.names = c(NA, 
                                                                                                               -32L))



# Plot Data With Prefix Trick

library(ggplot2)
library(ggpubr)

# Fill colour depends on Dist, point size depends on Frequency
ggballoonplot(Stack_Overflow_DummyData, x = "Region_prefix", y = "Species_prefix", 
              size = "Frequency", size.range = c(1, 9), fill = "Dist") +
  # Use the grey theme and remove the grey background from the legend keys
  # (theme_set() changes the global default and should not be added to a plot)
  theme_gray() +
  theme(legend.key = element_blank()) +
  # Remove both axis titles
  theme(axis.title = element_blank()) +
  # Add the Frequency values next to the circles
  geom_text(aes(label = Frequency), alpha = 1.0, size = 3, nudge_x = 0.4)

# Plot Data Without Prefix Trick

library(ggplot2)
library(ggpubr)

# Fill colour depends on Dist, point size depends on Frequency
ggballoonplot(Stack_Overflow_DummyData, x = "Region", y = "Species", 
              size = "Frequency", size.range = c(1, 9), fill = "Dist") +
  # Use the grey theme and remove the grey background from the legend keys
  # (theme_set() changes the global default and should not be added to a plot)
  theme_gray() +
  theme(legend.key = element_blank()) +
  # Remove both axis titles
  theme(axis.title = element_blank()) +
  # Add the Frequency values next to the circles
  geom_text(aes(label = Frequency), alpha = 1.0, size = 3, nudge_x = 0.4)

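Graphs below

Good graph.

With the label-prefix trick, the pattern in the data is visible:

Bad graph (R default).

Without the prefix trick, ggplot orders the data/labels automatically and the graph makes no sense:

In summary, I would like the Good graph output without having to add prefixes to my labels beforehand.

Thank you very much for your help.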

For the axis labels, I would first define a function that rewrites the labels of the breaks:

shlab <- function(lbl_brk){
  sub("^[a-z]+\\.", "", lbl_brk) # strips a leading lowercase prefix such as "a." or "ab."
}

Then, to change the labels, you only need to pass labels = shlab to scale_x_discrete and scale_y_discrete (the help page for scale_x_discrete says that labels can be "A function that takes the breaks as input and returns labels as output").
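As a quick check of what the label function does, here is a minimal sketch in base R that applies shlab to a few of the prefixed labels from the question's data. The vector name prefixed is only illustrative.

```r
# Same helper as above: strip a leading lowercase prefix followed by a dot.
shlab <- function(lbl_brk) {
  sub("^[a-z]+\\.", "", lbl_brk)
}

# A few prefixed labels taken from Species_prefix and Region_prefix
prefixed <- c("ac.Cau", "ad.Sem", "bf.Zan", "a.Far")

# Labels without a matching prefix are returned unchanged
shlab(prefixed)
shlab("Cau")
```

The axis order still comes from the prefixed values; only the text drawn on the axis changes.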

For the colours, it is enough to change them through the values argument of scale_fill_manual, and for the size of the legend keys use guides, so:

library(ggplot2)
library(ggpubr)
# Strip the ordering prefix (e.g. "a." or "ab.") from an axis label
shlab <- function(lbl_brk){
  sub("^[a-z]+\\.", "", lbl_brk)
}
ggballoonplot(Stack_Overflow_DummyData, x = "Region_prefix", y = "Species_prefix", size = "Frequency", size.range = c(1, 9), fill = "Dist") +
  scale_x_discrete(labels = shlab) +
  scale_y_discrete(labels = shlab) +
  scale_fill_manual(values = c("green", "blue", "red", "black", "white")) +
  guides(fill = guide_legend(override.aes = list(size = 8))) +          # Larger keys in the fill legend
  theme_gray() + theme(legend.key = element_blank()) +                  # Grey theme, no grey background behind the legend keys
  theme(axis.title = element_blank()) +                                 # Remove both axis titles
  geom_text(aes(label = Frequency), alpha = 1.0, size = 3, nudge_x = 0.4) # Add the Frequency values next to the circles
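The colours passed to scale_fill_manual above are matched to the levels of Dist by position. If it helps to make the mapping explicit, ggplot2 also accepts a named vector for values and matches it by name. A minimal sketch, where dist_levels and dist_colours are illustrative names and the pairing of colours to levels is simply the positional one from the call above:

```r
# Levels of Dist as they appear in the question's data
dist_levels <- c("End", "Ind", "Pan", "Per", "Wid")

# Pair each level with a colour by name instead of by position
dist_colours <- setNames(c("green", "blue", "red", "black", "white"), dist_levels)

dist_colours[["Pan"]]
```

The named vector would then replace the unnamed one, as in scale_fill_manual(values = dist_colours).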

Update:

Using the new dataset and a character vector of labels to set the order:

library(ggplot2)
library(ggpubr)

# Fill colour depends on Dist, point size depends on Frequency
ggballoonplot(Stack_Overflow_DummyData, x = "Region", y = "Species", 
              size = "Frequency", size.range = c(1, 9), fill = "Dist") +
  # limits fixes the order of the categories on each axis (first value at the bottom/left)
  scale_y_discrete(limits = c("Cau", "Sem", "Cal", "Ort", "Fis", "Ani", "Can", "Zan")) +
  scale_x_discrete(limits = c("Far", "Nor", "Cen", "Col")) +
  # Use the grey theme and remove the grey background from the legend keys
  theme_gray() +
  theme(legend.key = element_blank()) +
  # Remove both axis titles
  theme(axis.title = element_blank()) +
  # Add the Frequency values next to the circles
  geom_text(aes(label = Frequency), alpha = 1.0, size = 3, nudge_x = 0.4)
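
A related option, which is not part of the answer above, is to store the desired order in the data itself by setting the factor levels. ggplot2's discrete scales normally follow factor levels, so this should give the same order as the limits vectors, though I have not checked it against ggballoonplot specifically. The sketch below uses a small toy data frame (toy, species_order and region_order are illustrative names) rather than the full dataset:

```r
# Desired plotting order, same vectors as passed to limits above
species_order <- c("Cau", "Sem", "Cal", "Ort", "Fis", "Ani", "Can", "Zan")
region_order  <- c("Far", "Nor", "Cen", "Col")

# Toy stand-in for Stack_Overflow_DummyData with plain character columns
toy <- data.frame(
  Species   = c("Zan", "Ani", "Cau", "Sem"),
  Region    = c("Nor", "Far", "Col", "Cen"),
  Frequency = c(11.1, 25, 81.3, 37.8),
  stringsAsFactors = FALSE
)

# Re-level the columns so the order is carried by the data, not the labels
toy$Species <- factor(toy$Species, levels = species_order)
toy$Region  <- factor(toy$Region,  levels = region_order)

levels(toy$Species)
levels(toy$Region)
```

With the real data, the same two factor() calls would be applied to Stack_Overflow_DummyData before plotting, which keeps the order in one place if several plots share it.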