How to wrap graphs by category while keeping the same bar width with ggplot in R?

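I am struggling with facet_grid() and facet_wrap() in ggplot(). I would like to wrap separate stacked bar charts every two categories (here the variable Department), while keeping all bars the same width. The first can be achieved with facet_wrap(), and the second with facet_grid(). I would like to combine the advantages of both functions. Do you know how to solve this?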

The data are:

ID<-c("001","002","003","004","005","006","007","008","009","010","NA","012","013")
Name<-c("Damon Bell","Royce Sellers",NA,"Cali Wall","Alan Marshall","Amari Santos","Evelyn Frye","Kierra Osborne","Mohammed Jenkins","Kara Beltran","Davon Harmon","Kaitlin Hammond","Jovany Newman")
Sex<-c("Male","Male","Male",NA,"Male","Male",NA,"Female","Male","Female","Male","Female","Male")
Age<-c(33,27,29,26,27,35,29,32,NA,25,34,29,26)
UKCountry<-c("Scotland","Wales","Scotland","Wales","Northern Ireland","Wales","Northern Ireland","Scotland","England","Northern Ireland","England","England","Wales")
Department<-c("Sports and travel","Sports and travel","Sports and travel","Health and Beauty Care","Sports and travel","Home and lifestyle","Sports and travel","Fashion accessories","Electronic accessories","Electronic accessories","Health and Beauty Care","Electronic accessories",NA)

The code is:

library(dplyr)    # %>%, filter(), group_by(), summarise(), mutate()
library(ggplot2)

data<-data.frame(ID,Name,Sex,Age,UKCountry,Department)

## Frequency Table
dDepartmentSexUKCountry <- data %>% 
  filter(!is.na(Department) & !is.na(Sex) & !is.na(UKCountry)) %>%
  group_by(Department,Sex,UKCountry) %>% 
  summarise(Count = n()) %>%
  mutate(Total = sum(Count), Percentage = round(Count/Total,3)) 

## Graph
dDepartmentSexUKCountry %>% 
  ggplot(aes(x=Sex,
             y=Percentage,
             fill=UKCountry)) + 
  geom_bar(stat="identity",
           position="fill") + 
  geom_text(aes(label = paste0(round(Percentage*100,0),"%\n(", Count, ")")), 
            position=position_fill(vjust=0.5), color="white") + 
  theme(axis.ticks.x = element_blank(), 
       axis.text.x = element_text(angle = 45,hjust = 1)) + 
  #facet_grid(cols = vars(Department),scales = "free", space = "free")
  facet_wrap(. ~ Department, scales = "free", ncol = 2)

With facet_wrap(), I get:

With facet_grid(), I get:

Ideally, I would like (edited in Paint):

I have researched my problem, and I usually find one solution or the other, but not a combination of the two.

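Would the following be acceptable?

I got this by removing scales = "free" from facet_wrap(). The columns are all the same width. You may not like the empty space where one of the sexes has no data for a department. However, I think this is easier to read, because the category axis labels are in the same position on every plot (Female on the left, Male on the right), and the plot makes it clear that some departments had no purchases by female or male customers.

The code is as follows: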

dDepartmentSexUKCountry %>% 
  ggplot(aes(x=Sex,
             y=Percentage,
             fill=UKCountry)) + 
  geom_bar(stat="identity",
           position="fill") + 
  geom_text(aes(label = paste0(round(Percentage*100,0),"%\n(", Count, ")")), 
            position=position_fill(vjust=0.5), color="white") + 
  theme(axis.ticks.x = element_blank(), 
        axis.text.x = element_text(angle = 45,hjust = 1)) + 
  facet_wrap(. ~ Department,  ncol = 2)

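Here is an approach that splits the data into a set number of rows and assembles a grid of plots with patchwork. This is necessary because facet_grid won't break the data up in more than one way along the same dimension, i.e. it won't break the data into groups along the x-axis and also wrap them into multiple rows, and facet_wrap doesn't have the flexibility of free spacing. For something small this is definitely more complicated than it's worth, but it's a process I've used for figures that need to fit a bunch of information together for publication. It depends on your situation.

The basic idea is to split what will become the bars into 2 rows of panels. This is a little tricky because each bar is a combination of department and sex (hence interaction), and you are splitting not by number of observations but by unique identifier. You could do this in different ways, but the one that made sense to me was to use rleid to get a group number, then scale it by the number of rows you need. After that, split the data and make the same type of plot for what will become each row. You need country to be a factor, with a fill scale that doesn't drop missing factor levels, so that all the plots are guaranteed to have the same legend.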
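The numbering and row assignment described above can be tried in isolation. The following is a minimal base-R sketch, not the answer's own code: the `key` vector is made-up data standing in for `interaction(Department, Sex)`, and the `cumsum()` line is assumed to be an equivalent of `data.table::rleid()` for a single vector.

```r
# Toy keys standing in for interaction(Department, Sex); the values are made up.
key  <- c("A.F", "A.F", "A.M", "B.F", "C.M", "D.M", "E.M", "E.M", "E.M")
rows <- 2

# Run-length id: increases by one every time the key changes from the previous row.
id <- cumsum(c(TRUE, key[-1] != key[-length(key)]))

# Scale the ids onto 1..rows so each panel row gets about the same number of bars.
row <- ceiling(id / max(id) * rows)

data.frame(key, id, row)
```

Because the split is driven by `id` rather than by the row count, a bar that is built from several observations (like the last key here) always stays in a single panel row.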

rows <- 2

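# only difference between data here & OP is that it is ungrouped:
# the OP's summary is still grouped by Department and Sex, so without
# ungroup() the ids would restart inside every group
dept_ids <- dDepartmentSexUKCountry %>%
  ungroup() %>%
  mutate(UKCountry = as.factor(UKCountry),
         id = data.table::rleid(interaction(Department, Sex)),
         row = ceiling(id / max(id) * rows))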
dept_ids
#> # A tibble: 9 × 8
#>   Department             Sex    UKCountry     Count Total Percentage    id   row
#>   <chr>                  <chr>  <fct>         <int> <int>      <dbl> <int> <dbl>
#> 1 Electronic accessories Female England           1     2       0.5      1     1
#> 2 Electronic accessories Female Northern Ire…     1     2       0.5      1     1
#> 3 Electronic accessories Male   England           1     1       1        2     1
#> 4 Fashion accessories    Female Scotland          1     1       1        3     1
#> 5 Health and Beauty Care Male   England           1     1       1        4     2

plots <- dept_ids %>%
  split(.$row) %>%
  purrr::map(function(df) {
    ggplot(df, aes(x=Sex,
               y=Percentage,
               fill=UKCountry)) + 
      geom_bar(stat="identity",
               position="fill") + 
      geom_text(aes(label = paste0(round(Percentage*100,0),"%\n(", Count, ")")), 
                position=position_fill(vjust=0.5), color="white") + 
      theme(axis.ticks.x = element_blank()) + 
      facet_grid(cols = vars(Department),scales = "free", space = "free") +
      scale_fill_discrete(drop = FALSE)
  })

patchwork::wrap_plots(plots, nrow = rows, guides = "collect")
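Why the factor conversion matters for the shared legend can be seen without any plotting. This is a small base-R sketch with made-up values (`country` and `panel_row` are hypothetical names): after splitting, each piece contains only some of the countries but still carries every level, and those retained levels are what `scale_fill_discrete(drop = FALSE)` keeps in the legend.

```r
# Made-up data: four observations assigned to two panel rows.
country   <- factor(c("England", "Scotland", "Wales", "England"))
panel_row <- c(1, 1, 2, 2)

pieces <- split(country, panel_row)

# Values actually present in each piece differ...
lapply(pieces, function(x) unique(as.character(x)))

# ...but the full set of levels survives the split.
lapply(pieces, levels)

# droplevels() shows what would be left if unused levels were discarded.
lapply(pieces, function(x) levels(droplevels(x)))
```

With a plain character column there are no levels to retain, so each row's plot would only know about the countries it happens to contain.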

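One issue is that you get repeated x-axis titles. Since the title in this case is pretty self-explanatory, you could drop it altogether, or you could turn it off in the theme of all the plots, patch them together, and then turn it back on in the last plot. The last plot that goes into patchwork's assembly function is the one that receives the theme settings.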

plots %>%
  purrr::map(~. + theme(axis.title.x = element_blank())) %>%
  patchwork::wrap_plots(nrow = rows, guides = "collect") +
  theme(axis.title.x = element_text())
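For the other option mentioned above, dropping the x-axis title everywhere, a minimal sketch could use patchwork's `&` operator, which applies a theme to every plot in the assembly rather than only the last one. The two toy plots below are placeholders with made-up data, not the plots from this answer.

```r
library(ggplot2)
library(patchwork)

# Two throwaway plots standing in for the per-row plots; the data are made up.
toy <- data.frame(Sex = c("Female", "Male"), Percentage = c(0.4, 0.6))
p1 <- ggplot(toy, aes(x = Sex, y = Percentage)) + geom_col()
p2 <- ggplot(toy, aes(x = Sex, y = Percentage)) + geom_col()

# `+ theme()` would only modify the last plot; `& theme()` modifies all of them.
combined <- wrap_plots(list(p1, p2), nrow = 2) &
  theme(axis.title.x = element_blank())

combined
```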

Like I said, in many cases this will be more work than it's worth, but I tried to make it flexible enough to use in larger-scale projects where it does make sense.