使用 ggplot 和 geom_rect hack 绘制更好的树状图,但如何使标签正确?
Plotting a better treemap using ggplot and the geom_rect hack, but how to get the labels correct?
我发现使用 ggplot 创建树状图具有挑战性,this blog example 很好地捕捉到了问题并提供了很好的解决方法。解决方法采用树形图包的输出来创建一个带有 geom_rect.
的 ggplot 版本
我的问题是如何调整标签,如果我愿意的话,还可以按层次调整颜色,因为我的组比链接示例多,并且有不同的标签要求。
这是一个可重现的简单示例:
library(tidyverse)
library(treemap)
# Create dummy data
tree_data <- data.frame(
my_segment = c(
rep("seg_a", 5),
rep("seg_b", 6),
rep("seg_c", 7)),
my_class = c(
rep("class_1", 2),
rep("class_2", 2),
rep("class_3", 1),
rep("class_4", 2),
rep("class_5", 2),
rep("class_6", 2),
rep("class_7", 1),
rep("class_8", 3),
rep("class_9", 3)),
my_type = c(
rep("type_1", 7),
rep("type_2", 6),
rep("type_3", 5)),
vals = round(runif(18, min = 20, max = 100), 0)
)
这是示例数据帧的头部:
my_segment my_class my_type vals
1 seg_a class_1 type_1 86
2 seg_a class_1 type_1 41
3 seg_a class_2 type_1 23
4 seg_a class_2 type_1 79
5 seg_a class_3 type_1 33
6 seg_b class_4 type_1 82
7 seg_b class_4 type_1 85
8 seg_b class_5 type_2 40
9 seg_b class_5 type_2 83
10 seg_b class_6 type_2 69
11 seg_b class_6 type_2 98
12 seg_c class_7 type_2 91
13 seg_c class_8 type_2 33
树图包运行良好,但在 RStudio 中产生不可读的输出,我希望能够使用 ggplot 进行更多自定义(类似于链接的文章)
# Run treemap function
tree_p <- treemap(
tree_data,
index = c("my_segment", "my_class", "my_type"),
vColor = "my_segment",
vSize = "vals",
type = "index",
fontsize.labels = c(15, 12, 10),
fontcolor.labels = c("white", "orange", "green"),
fontface.labels = c(2, 1, 1),
bg.labels = 0,
align.labels = list(
c("center", "center"),
c("right", "bottom"),
c("left", "bottom")
),
overlap.labels = 0.5,
inflate.labels = FALSE
)
# Note: unreadable output in Rstudio (too small)
使用此博客中的解决方法,但添加额外的层次结构并想要更改标签是问题所在。
# Create the plot in ggplot using geom_rect
# Get underlying data created from running treemap
tm_plot_data <- tree_p$tm %>%
mutate(x1 = x0 + w,
y1 = y0 + h) %>%
mutate(x = (x0+x1)/2,
y = (y0+y1)/2) %>%
mutate(
primary_group = case_when(
level == 1 ~ 1.5,
level == 2 ~ 0.75,
TRUE ~ 0.5
)
)
# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) +
# add fill and borders for groups and subgroups
geom_rect(aes(fill = color, size = primary_group),
show.legend = FALSE,
color = "black",
alpha = 0.3
) +
scale_fill_identity() +
# set thicker lines for group borders
scale_size(range = range(tm_plot_data$primary_group)) +
# add labels
ggfittext::geom_fit_text(aes(label = my_segment), color = "white", min.size = 1) +
ggfittext::geom_fit_text(aes(label = my_class), color = "blue", min.size = 1) +
ggfittext::geom_fit_text(aes(label = my_type), color = "red", min.size = 1) +
# options
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
theme_void()
所以我的问题是有没有办法创建像树图这样的标签?具体来说,seg_a
、seg_b
和 seg_c
应该只出现一次,并以各自段的区域为中心。我还想移动标签,使它们不重叠
感谢您的帮助和建议!
问题是您使用完整数据集 tm_plot_data
来添加标签。因此,对于每个上层,您都会获得多个标签。要解决此问题,请聚合您的数据集并将这些数据集作为 data
传递给 ggfittext::geom_fit_text
。要处理重叠标签,您可以例如使用 ggfittext::geom_fit_text
的 place
参数将 class 标签移动到左下角,将类型标签移动到右上角。
library(tidyverse)
library(treemap)
set.seed(123)
tm_seg <- tm_plot_data %>%
group_by(my_segment) %>%
summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
ungroup()
tm_class <- tm_plot_data %>%
group_by(my_segment, my_class) %>%
summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
ungroup()
tm_type <- tm_plot_data %>%
group_by(my_segment, my_class, my_type) %>%
summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
ungroup()
# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) +
# add fill and borders for groups and subgroups
geom_rect(aes(fill = color, size = primary_group),
show.legend = FALSE,
color = "black",
alpha = 0.3
) +
scale_fill_identity() +
# set thicker lines for group borders
scale_size(range = range(tm_plot_data$primary_group)) +
# add labels
ggfittext::geom_fit_text(data = tm_seg, aes(label = my_segment), color = "white", min.size = 4) +
ggfittext::geom_fit_text(data = tm_class, aes(label = my_class), color = "blue", min.size = 1, place = "bottomleft") +
ggfittext::geom_fit_text(data = tm_type, aes(label = my_type), color = "red", min.size = 1, place = "topright") +
# options
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
theme_void()
#> Warning: Removed 3 rows containing missing values (geom_fit_text).
#> Warning: Removed 12 rows containing missing values (geom_fit_text).
我发现使用 ggplot 创建树状图具有挑战性,this blog example 很好地捕捉到了问题并提供了很好的解决方法。解决方法采用树形图包的输出来创建一个带有 geom_rect.
的 ggplot 版本我的问题是如何调整标签,如果我愿意的话,还可以按层次调整颜色,因为我的组比链接示例多,并且有不同的标签要求。
这是一个可重现的简单示例:
library(tidyverse)
library(treemap)
# Create dummy data
tree_data <- data.frame(
my_segment = c(
rep("seg_a", 5),
rep("seg_b", 6),
rep("seg_c", 7)),
my_class = c(
rep("class_1", 2),
rep("class_2", 2),
rep("class_3", 1),
rep("class_4", 2),
rep("class_5", 2),
rep("class_6", 2),
rep("class_7", 1),
rep("class_8", 3),
rep("class_9", 3)),
my_type = c(
rep("type_1", 7),
rep("type_2", 6),
rep("type_3", 5)),
vals = round(runif(18, min = 20, max = 100), 0)
)
这是示例数据帧的头部:
my_segment my_class my_type vals
1 seg_a class_1 type_1 86
2 seg_a class_1 type_1 41
3 seg_a class_2 type_1 23
4 seg_a class_2 type_1 79
5 seg_a class_3 type_1 33
6 seg_b class_4 type_1 82
7 seg_b class_4 type_1 85
8 seg_b class_5 type_2 40
9 seg_b class_5 type_2 83
10 seg_b class_6 type_2 69
11 seg_b class_6 type_2 98
12 seg_c class_7 type_2 91
13 seg_c class_8 type_2 33
树图包运行良好,但在 RStudio 中产生不可读的输出,我希望能够使用 ggplot 进行更多自定义(类似于链接的文章)
# Run treemap function
tree_p <- treemap(
tree_data,
index = c("my_segment", "my_class", "my_type"),
vColor = "my_segment",
vSize = "vals",
type = "index",
fontsize.labels = c(15, 12, 10),
fontcolor.labels = c("white", "orange", "green"),
fontface.labels = c(2, 1, 1),
bg.labels = 0,
align.labels = list(
c("center", "center"),
c("right", "bottom"),
c("left", "bottom")
),
overlap.labels = 0.5,
inflate.labels = FALSE
)
# Note: unreadable output in Rstudio (too small)
使用此博客中的解决方法,但添加额外的层次结构并想要更改标签是问题所在。
# Create the plot in ggplot using geom_rect
# Get underlying data created from running treemap
tm_plot_data <- tree_p$tm %>%
mutate(x1 = x0 + w,
y1 = y0 + h) %>%
mutate(x = (x0+x1)/2,
y = (y0+y1)/2) %>%
mutate(
primary_group = case_when(
level == 1 ~ 1.5,
level == 2 ~ 0.75,
TRUE ~ 0.5
)
)
# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) +
# add fill and borders for groups and subgroups
geom_rect(aes(fill = color, size = primary_group),
show.legend = FALSE,
color = "black",
alpha = 0.3
) +
scale_fill_identity() +
# set thicker lines for group borders
scale_size(range = range(tm_plot_data$primary_group)) +
# add labels
ggfittext::geom_fit_text(aes(label = my_segment), color = "white", min.size = 1) +
ggfittext::geom_fit_text(aes(label = my_class), color = "blue", min.size = 1) +
ggfittext::geom_fit_text(aes(label = my_type), color = "red", min.size = 1) +
# options
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
theme_void()
所以我的问题是有没有办法创建像树图这样的标签?具体来说,seg_a
、seg_b
和 seg_c
应该只出现一次,并以各自段的区域为中心。我还想移动标签,使它们不重叠
感谢您的帮助和建议!
问题是您使用完整数据集 tm_plot_data
来添加标签。因此,对于每个上层,您都会获得多个标签。要解决此问题,请聚合您的数据集并将这些数据集作为 data
传递给 ggfittext::geom_fit_text
。要处理重叠标签,您可以例如使用 ggfittext::geom_fit_text
的 place
参数将 class 标签移动到左下角,将类型标签移动到右上角。
library(tidyverse)
library(treemap)
set.seed(123)
tm_seg <- tm_plot_data %>%
group_by(my_segment) %>%
summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
ungroup()
tm_class <- tm_plot_data %>%
group_by(my_segment, my_class) %>%
summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
ungroup()
tm_type <- tm_plot_data %>%
group_by(my_segment, my_class, my_type) %>%
summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>%
ungroup()
# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) +
# add fill and borders for groups and subgroups
geom_rect(aes(fill = color, size = primary_group),
show.legend = FALSE,
color = "black",
alpha = 0.3
) +
scale_fill_identity() +
# set thicker lines for group borders
scale_size(range = range(tm_plot_data$primary_group)) +
# add labels
ggfittext::geom_fit_text(data = tm_seg, aes(label = my_segment), color = "white", min.size = 4) +
ggfittext::geom_fit_text(data = tm_class, aes(label = my_class), color = "blue", min.size = 1, place = "bottomleft") +
ggfittext::geom_fit_text(data = tm_type, aes(label = my_type), color = "red", min.size = 1, place = "topright") +
# options
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0)) +
theme_void()
#> Warning: Removed 3 rows containing missing values (geom_fit_text).
#> Warning: Removed 12 rows containing missing values (geom_fit_text).