Reordering multiple columns by fill with count and legend reorder

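I'm having trouble reordering the columns for different groups in a dataset by the relative count within each group. The dataset is given below as a tibble. It has 3 groups, each with different device types and frequency counts: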

library(tidyverse) 
library(ggplot2)

dy2 <- tibble(generation = c("All Devices","All Devices","All Devices","All Devices","All Devices","All Devices", 
                      "First Gen", "First Gen","First Gen","First Gen","First Gen","First Gen",
                      "Subsequent Gen","Subsequent Gen","Subsequent Gen","Subsequent Gen","Subsequent Gen"),
       device_type = as.factor(c("Accessories", "Aspiration_catheter", "Guidewire","Microcatheter", "Sheath", "Stentretriever",
                                 "Accessories", "Aspiration_catheter", "Guidewire","Microcatheter", "Sheath", "Stentretriever",
                                 "Accessories", "Aspiration_catheter", "Guidewire", "Sheath", "Stentretriever")),
       N = c(6,36,26,4,18,39,3,20,17,4,8,14,3,16,9,10,25))

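When I plot the dataset with ggplot, I want the device types within each group ordered by increasing N, with a geom_text label above each column. I can only get this to work for the first group ("All Devices"). The code for the plot is as follows: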

dy2 %>% 
  ggplot(aes(x= generation, y= N, fill= reorder(device_type,N, function(x){sum(x)}))) +
  geom_bar(position= position_dodge(), alpha= 0.85, stat = "identity")+
  geom_text(data= ~ subset(.x, generation %in% c("All Devices")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+
  geom_text(data= ~ subset(.x, generation %in% c("First Gen")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+ 
  geom_text(data= ~ subset(.x, generation %in% c("Subsequent Gen")), position=position_dodge(0.9), aes(y= N+0.8, label= N), size= 3, show.legend= FALSE)+
  scale_fill_manual(name= NULL,
                    values = c("blue", "black", "red", "green3", "cyan4", "purple"),
                    breaks = c("Accessories", "Aspiration_catheter", "Guidewire",
                               "Microcatheter", "Sheath", "Stentretriever"),
                    labels = c("Accessories", "Aspiration catheter", "Guidewire",
                               "Microcatheter", "Sheath", "Stentretriever")) +
  #scale_x_discrete(breaks= c("All Devices", "First Gen", "Subsequent Gen"),
                 #  labels= c("All<br>Devices", "First<br>Gen", "Sub<br>Gen"))+
  theme_classic()

This gives the following plot:

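As you can see, the device-type columns for "First Gen" and "Subsequent Gen" are not in the correct order, and the geom_text labels showing N above each column are in the right positions but do not match the columns they sit on.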

I have tried splitting up the dataset and using various reordering commands, but to no avail.

I also cannot get the fill legend to follow the order of the "All Devices" group, no matter how I arrange the breaks in scale_fill_manual.

I'm sure I'm missing something about factors, but any help would be appreciated.

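One option is to use a helper column:

  1. Arrange your data by generation and N.
  2. Create the helper column. I simply paste generation and device_type together.
  3. Set the levels of the helper column in the order of the dataset, e.g. using forcats::fct_inorder.
  4. Map the helper column on the group aes.
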
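To see why a helper column is needed, it helps to check what reorder() does with the question's data. The sketch below uses only base R; the helper function order_within is a hypothetical name introduced here for illustration. It suggests that reorder(device_type, N, sum) yields a single level order based on totals across all generations, which happens to match the within-group order for "All Devices" but not for "First Gen".

```r
# Same data as in the question, written compactly in base R.
dy2 <- data.frame(
  generation = rep(c("All Devices", "First Gen", "Subsequent Gen"), times = c(6, 6, 5)),
  device_type = c(
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Sheath", "Stentretriever"
  ),
  N = c(6, 36, 26, 4, 18, 39, 3, 20, 17, 4, 8, 14, 3, 16, 9, 10, 25)
)

# reorder() ranks each device type once, by its total N over all generations.
global_order <- levels(reorder(factor(dy2$device_type), dy2$N, FUN = sum))

# Hypothetical helper: device types sorted by N inside one generation.
order_within <- function(gen) {
  rows <- dy2[dy2$generation == gen, ]
  as.character(rows$device_type[order(rows$N)])
}

global_order
order_within("All Devices")  # same as the global order
order_within("First Gen")    # differs from the global order
```

Because a factor has only one level order, the fill variable alone cannot sort the bars differently in each generation. A group variable with one level per generation and device pair can.
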
library(dplyr)
library(forcats)
library(ggplot2)

dy2 <- dy2 %>%
  arrange(generation, N) %>%
  mutate(
    device_type2 = paste(generation, device_type, sep = "_"),
    device_type2 = fct_inorder(device_type2)
  )

ggplot(dy2, aes(x = generation, y = N, fill = device_type, group = device_type2)) +
  geom_bar(position = position_dodge(), alpha = 0.85, stat = "identity") +
  geom_text(position = position_dodge(0.9), aes(y = N + 0.8, label = N), size = 3, show.legend = FALSE) +
  scale_fill_manual(
    name = NULL,
    values = c("blue", "black", "red", "green3", "cyan4", "purple"),
    breaks = c(
      "Accessories", "Aspiration_catheter", "Guidewire",
      "Microcatheter", "Sheath", "Stentretriever"
    )
  ) +
  theme_classic()
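
The code above does not change the legend order, which the question also asks about. A possible extension is sketched here, under the assumption that the legend should list device types by increasing N in the "All Devices" group: take the levels of device_type (and the breaks) from that group, and use a named colour vector so each device type keeps its colour. The names legend_order and device_cols are illustrative and not part of the original answer.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Same data as in the question, written compactly.
dy2 <- data.frame(
  generation = rep(c("All Devices", "First Gen", "Subsequent Gen"), times = c(6, 6, 5)),
  device_type = c(
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Microcatheter", "Sheath", "Stentretriever",
    "Accessories", "Aspiration_catheter", "Guidewire", "Sheath", "Stentretriever"
  ),
  N = c(6, 36, 26, 4, 18, 39, 3, 20, 17, 4, 8, 14, 3, 16, 9, 10, 25)
)

# Assumed legend order: device types sorted by N within "All Devices".
legend_order <- dy2 %>%
  filter(generation == "All Devices") %>%
  arrange(N) %>%
  pull(device_type) %>%
  as.character()

# Named colours keep each device type tied to its colour regardless of order.
device_cols <- c(
  Accessories = "blue", Aspiration_catheter = "black", Guidewire = "red",
  Microcatheter = "green3", Sheath = "cyan4", Stentretriever = "purple"
)

dy2 <- dy2 %>%
  arrange(generation, N) %>%
  mutate(
    # Factor levels drive the legend order.
    device_type = factor(device_type, levels = legend_order),
    # Helper column drives the bar order within each generation.
    device_type2 = fct_inorder(paste(generation, device_type, sep = "_"))
  )

p <- ggplot(dy2, aes(x = generation, y = N, fill = device_type, group = device_type2)) +
  geom_col(position = position_dodge(), alpha = 0.85) +
  geom_text(aes(y = N + 0.8, label = N), position = position_dodge(0.9), size = 3, show.legend = FALSE) +
  scale_fill_manual(name = NULL, values = device_cols, breaks = legend_order) +
  theme_classic()
p
```

Since the bar order comes from the group aesthetic and the legend order from the device_type levels, the two can be set independently of each other.
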