How to wrap graphs by categories while keeping the same width of bars with ggplot in R?
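I am struggling with facet_grid() and facet_wrap() in ggplot(). I would like to wrap the stacked bar charts after every two categories (here, the variable Department) while keeping all bars the same width. The first can be achieved with facet_wrap(), the second with facet_grid(). I would like to combine the advantages of both functions. Do you know how to solve this?
The data are: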
ID<-c("001","002","003","004","005","006","007","008","009","010","NA","012","013")
Name<-c("Damon Bell","Royce Sellers",NA,"Cali Wall","Alan Marshall","Amari Santos","Evelyn Frye","Kierra Osborne","Mohammed Jenkins","Kara Beltran","Davon Harmon","Kaitlin Hammond","Jovany Newman")
Sex<-c("Male","Male","Male",NA,"Male","Male",NA,"Female","Male","Female","Male","Female","Male")
Age<-c(33,27,29,26,27,35,29,32,NA,25,34,29,26)
UKCountry<-c("Scotland","Wales","Scotland","Wales","Northern Ireland","Wales","Northern Ireland","Scotland","England","Northern Ireland","England","England","Wales")
Department<-c("Sports and travel","Sports and travel","Sports and travel","Health and Beauty Care","Sports and travel","Home and lifestyle","Sports and travel","Fashion accessories","Electronic accessories","Electronic accessories","Health and Beauty Care","Electronic accessories",NA)
The code is:
library(dplyr)
library(ggplot2)

data<-data.frame(ID,Name,Sex,Age,UKCountry,Department)
## Frequency Table
# After summarise() the result is still grouped by Department and Sex,
# so Total is the count per bar (Department x Sex).
dDepartmentSexUKCountry <- data %>%
  filter(!is.na(Department) & !is.na(Sex) & !is.na(UKCountry)) %>%
  group_by(Department,Sex,UKCountry) %>%
  summarise(Count = n()) %>%
  mutate(Total = sum(Count), Percentage = round(Count/Total,3))
## Graph
dDepartmentSexUKCountry %>%
  ggplot(aes(x=Sex,
             y=Percentage,
             fill=UKCountry)) +
  geom_bar(stat="identity",
           position="fill") +
  geom_text(aes(label = paste0(round(Percentage*100,0),"%\n(", Count, ")")),
            position=position_fill(vjust=0.5), color="white") +
  theme(axis.ticks.x = element_blank(),
        axis.text.x = element_text(angle = 45,hjust = 1)) +
  #facet_grid(cols = vars(Department),scales = "free", space = "free")
  facet_wrap(. ~ Department, scales = "free", ncol = 2)
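With facet_wrap(), I get: (image)
With facet_grid(), I get: (image)
Ideally, I would like (edited in Paint): (image)
I have researched my problem, and I usually find one solution or the other, but not a combination of both.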
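Would the following be acceptable?
I got this by removing scales = "free" from facet_wrap(). The bars all have the same width. You may not like the empty space where one of the sexes has no data for a department. However, I think this is easier to read, because the category axis labels sit in the same position on every panel (Female on the left, Male on the right), and the plot makes it clear that some departments had no purchases by female or by male customers.
The code is: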
dDepartmentSexUKCountry %>%
  ggplot(aes(x=Sex,
             y=Percentage,
             fill=UKCountry)) +
  geom_bar(stat="identity",
           position="fill") +
  geom_text(aes(label = paste0(round(Percentage*100,0),"%\n(", Count, ")")),
            position=position_fill(vjust=0.5), color="white") +
  theme(axis.ticks.x = element_blank(),
        axis.text.x = element_text(angle = 45,hjust = 1)) +
  facet_wrap(. ~ Department, ncol = 2)
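Before choosing fixed scales, it can help to see which Department and Sex combinations are actually empty, since those are the gaps that will appear in the panels. A minimal base R sketch, reusing the Department and Sex vectors from the question (the name tab is only illustrative):

```r
# Vectors copied from the question
Sex <- c("Male","Male","Male",NA,"Male","Male",NA,"Female","Male","Female","Male","Female","Male")
Department <- c("Sports and travel","Sports and travel","Sports and travel","Health and Beauty Care","Sports and travel","Home and lifestyle","Sports and travel","Fashion accessories","Electronic accessories","Electronic accessories","Health and Beauty Care","Electronic accessories",NA)

# Cross-tabulate; table() drops NA values by default,
# which mirrors the filter(!is.na(...)) step in the question
tab <- table(Department, Sex)
tab

# Combinations with a zero count are the empty bar positions
which(tab == 0, arr.ind = TRUE)
```

Every zero cell corresponds to a blank slot in the fixed-scale facet_wrap() plot.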
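Here is an approach that splits the data into a set number of rows and uses patchwork to assemble a grid of plots. This is necessary because facet_grid won't break the data up in more than one way along the same dimension, i.e. it won't split the data into groups along the x-axis and also wrap them onto multiple rows, and facet_wrap doesn't have the flexibility of free spacing. For something small this is certainly more complicated than it's worth, but it is the process I use for figures that need to fit a lot of information together for publication. It depends on your situation.
The basic idea is to divide what will become the bars across 2 rows of panels. This is a little tricky because each bar is a combination of department and sex (hence the use of interaction), and you are splitting not by number of observations but by unique identifiers. You could do this in different ways, but what made sense to me was to use rleid to get group numbers, then scale those by the number of rows you need. After that, split the data and make the same type of plot for what will become each row. Country needs to be a factor, and the fill scale must not drop unused factor levels, so that all plots are guaranteed to share the same legend.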
rows <- 2
# only difference between the data here & the OP's is that I ungrouped it,
# so that rleid() numbers the bars across the whole table
dept_ids <- dDepartmentSexUKCountry %>%
  ungroup() %>%
  mutate(UKCountry = as.factor(UKCountry),
         id = data.table::rleid(interaction(Department, Sex)),
         row = ceiling(id / max(id) * rows))
dept_ids
#> # A tibble: 9 × 8
#>   Department             Sex    UKCountry     Count Total Percentage    id   row
#>   <chr>                  <chr>  <fct>         <int> <int>      <dbl> <int> <dbl>
#> 1 Electronic accessories Female England           1     2        0.5     1     1
#> 2 Electronic accessories Female Northern Ire…     1     2        0.5     1     1
#> 3 Electronic accessories Male   England           1     1        1       2     1
#> 4 Fashion accessories    Female Scotland          1     1        1       3     1
#> 5 Health and Beauty Care Male   England           1     1        1       4     2
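To see how the row assignment behaves on its own, here is a small base R sketch. The keys vector and the run_id() helper are hypothetical stand-ins for interaction(Department, Sex) and data.table::rleid(); they are not part of the answer's code.

```r
# Hypothetical bar keys, one per data row, sorted so equal keys are adjacent
keys <- c("Electronic.Female", "Electronic.Female", "Electronic.Male",
          "Fashion.Female", "Health.Male", "Home.Male",
          "Sports.Male", "Sports.Male", "Sports.Male")

# Base R stand-in for data.table::rleid(): number each run of equal values
run_id <- function(x) {
  r <- rle(as.character(x))
  rep(seq_along(r$lengths), r$lengths)
}

rows <- 2
id  <- run_id(keys)                    # one id per bar
row <- ceiling(id / max(id) * rows)    # scale ids onto 1..rows

data.frame(keys, id, row)

# Number of distinct bars that end up in each row of panels
tapply(id, row, function(i) length(unique(i)))
```

With six distinct bars and rows <- 2, bars 1 to 3 land in the first row and bars 4 to 6 in the second, so each row of panels carries the same number of bars.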
plots <- dept_ids %>%
  split(.$row) %>%
  purrr::map(function(df) {
    ggplot(df, aes(x=Sex,
                   y=Percentage,
                   fill=UKCountry)) +
      geom_bar(stat="identity",
               position="fill") +
      geom_text(aes(label = paste0(round(Percentage*100,0),"%\n(", Count, ")")),
                position=position_fill(vjust=0.5), color="white") +
      theme(axis.ticks.x = element_blank()) +
      facet_grid(cols = vars(Department),scales = "free", space = "free") +
      scale_fill_discrete(drop = FALSE)
  })
patchwork::wrap_plots(plots, nrow = rows, guides = "collect")
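One issue is that you get repeated x-axis titles. Since the title in this case is pretty self-explanatory, you could drop it altogether, or you could turn it off in the theme of every plot, patch the plots together, and then turn it back on for the last plot. The last plot that goes into patchwork's assembly functions is the one that receives theme settings added afterwards.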
plots %>%
  purrr::map(~. + theme(axis.title.x = element_blank())) %>%
  patchwork::wrap_plots(nrow = rows, guides = "collect") +
  theme(axis.title.x = element_text())
Like I said, in many cases this will be more work than it's worth, but I tried to make it flexible enough to use in larger-scale projects where it does make sense.