使用 ggplot 和 geom_rect hack 绘制更好的树状图,但如何使标签正确?

Plotting a better treemap using ggplot and the geom_rect hack, but how to get the labels correct?

我发现使用 ggplot 创建树状图具有挑战性,this blog example 很好地捕捉到了问题并提供了很好的解决方法。解决方法采用树形图包的输出来创建一个带有 geom_rect.

的 ggplot 版本

我的问题是如何调整标签,如果我愿意的话,还可以按层次调整颜色,因为我的组比链接示例多,并且有不同的标签要求。

这是一个可重现的简单示例:

library(tidyverse)
library(treemap)


# Create dummy data
tree_data <- data.frame(
  my_segment = c(
    rep("seg_a", 5),
    rep("seg_b", 6),
    rep("seg_c", 7)),
  my_class = c(
    rep("class_1", 2),
    rep("class_2", 2),
    rep("class_3", 1),
    rep("class_4", 2),
    rep("class_5", 2),
    rep("class_6", 2),
    rep("class_7", 1),
    rep("class_8", 3),
    rep("class_9", 3)),
  my_type = c(
    rep("type_1", 7),
    rep("type_2", 6),
    rep("type_3", 5)),
  vals = round(runif(18, min = 20, max = 100), 0)
)

这是示例数据帧的头部:

   my_segment my_class my_type vals
1       seg_a  class_1  type_1   86
2       seg_a  class_1  type_1   41
3       seg_a  class_2  type_1   23
4       seg_a  class_2  type_1   79
5       seg_a  class_3  type_1   33
6       seg_b  class_4  type_1   82
7       seg_b  class_4  type_1   85
8       seg_b  class_5  type_2   40
9       seg_b  class_5  type_2   83
10      seg_b  class_6  type_2   69
11      seg_b  class_6  type_2   98
12      seg_c  class_7  type_2   91
13      seg_c  class_8  type_2   33

树图包运行良好,但在 RStudio 中产生不可读的输出,我希望能够使用 ggplot 进行更多自定义(类似于链接的文章)

# Run treemap function
tree_p <- treemap(
  tree_data,
  index            = c("my_segment", "my_class", "my_type"),
  vColor           = "my_segment",
  vSize            = "vals",
  type             = "index",
  fontsize.labels  = c(15, 12, 10),
  fontcolor.labels = c("white", "orange", "green"),
  fontface.labels  = c(2, 1, 1),
  bg.labels        = 0,
  align.labels     = list(
    c("center", "center"),
    c("right", "bottom"),
    c("left", "bottom")
  ),
  overlap.labels   = 0.5,
  inflate.labels   = FALSE
)
    
# Note:  unreadable output in Rstudio (too small)

使用此博客中的解决方法,但添加额外的层次结构并想要更改标签是问题所在。

# Create the plot in ggplot using geom_rect

# Get underlying data created from running treemap
tm_plot_data <- tree_p$tm %>% 
  mutate(x1 = x0 + w,
         y1 = y0 + h) %>% 
  mutate(x = (x0+x1)/2,
         y = (y0+y1)/2) %>% 
  mutate(
    primary_group = case_when(
      level == 1 ~ 1.5,
      level == 2 ~ 0.75,
      TRUE       ~ 0.5
    )
  ) 



# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) + 
  # add fill and borders for groups and subgroups
  geom_rect(aes(fill = color, size = primary_group),
            show.legend = FALSE,
            color       = "black",
            alpha       = 0.3
  ) +
  scale_fill_identity() +
  # set thicker lines for group borders
  scale_size(range = range(tm_plot_data$primary_group)) +
  # add labels
  ggfittext::geom_fit_text(aes(label = my_segment), color = "white", min.size = 1) +
  ggfittext::geom_fit_text(aes(label = my_class), color = "blue", min.size = 1) +
  ggfittext::geom_fit_text(aes(label = my_type), color = "red", min.size = 1) +
  # options
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  theme_void()

所以我的问题是有没有办法创建像树图这样的标签?具体来说,seg_aseg_bseg_c 应该只出现一次,并以各自段的区域为中心。我还想移动标签,使它们不重叠

感谢您的帮助和建议!

问题是您使用完整数据集 tm_plot_data 来添加标签。因此,对于每个上层,您都会获得多个标签。要解决此问题,请聚合您的数据集并将这些数据集作为 data 传递给 ggfittext::geom_fit_text。要处理重叠标签,您可以例如使用 ggfittext::geom_fit_textplace 参数将 class 标签移动到左下角,将类型标签移动到右上角。

library(tidyverse)
library(treemap)

set.seed(123)

tm_seg <- tm_plot_data %>% 
  group_by(my_segment) %>% 
  summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>% 
  ungroup()

tm_class <- tm_plot_data %>% 
  group_by(my_segment, my_class) %>% 
  summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>% 
  ungroup()

tm_type <- tm_plot_data %>% 
  group_by(my_segment, my_class, my_type) %>% 
  summarise(x0 = min(x0), y0 = min(y0), y1 = max(y1), x1 = max(x1)) %>% 
  ungroup()

# Plot
ggplot(tm_plot_data, aes(xmin = x0, ymin = y0, xmax = x1, ymax = y1)) +
  # add fill and borders for groups and subgroups
  geom_rect(aes(fill = color, size = primary_group),
    show.legend = FALSE,
    color       = "black",
    alpha       = 0.3
  ) +
  scale_fill_identity() +
  # set thicker lines for group borders
  scale_size(range = range(tm_plot_data$primary_group)) +
  # add labels
  ggfittext::geom_fit_text(data = tm_seg, aes(label = my_segment), color = "white", min.size = 4) +
  ggfittext::geom_fit_text(data = tm_class, aes(label = my_class), color = "blue", min.size = 1, place = "bottomleft") +
  ggfittext::geom_fit_text(data = tm_type, aes(label = my_type), color = "red", min.size = 1, place = "topright") +
  # options
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  theme_void()
#> Warning: Removed 3 rows containing missing values (geom_fit_text).
#> Warning: Removed 12 rows containing missing values (geom_fit_text).