R ggplot2 ggrepel - label a subset of points while being aware of all points
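I built a fairly dense scatter plot with R's 'ggplot2' and I want to label a subset of points using 'ggrepel'. My problem is that I want to plot ALL points in the scatter plot but label only a subset with ggrepel. When I do this, ggrepel does not take the other points on the plot into account when calculating where to place the labels, so the labels end up overlapping other points on the plot (the ones I do not want to label).
Here is an example plot that illustrates the problem.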
# generate data:
library(data.table)
library(stringi)
library(ggplot2)  # needed for ggplot() and geom_point()
library(ggrepel)  # needed for geom_text_repel() in the later snippets
set.seed(20180918)
dt = data.table(
  name = stri_rand_strings(3000, length = 6),
  one = rnorm(n = 3000, mean = 0, sd = 1),
  two = rnorm(n = 3000, mean = 0, sd = 1))
dt[, diff := one - two]
# classify points by quadrant, but only when |diff| > 1
dt[, diff_cat := ifelse(one > 0 & two > 0 & abs(diff) > 1, "type_1",
              ifelse(one < 0 & two < 0 & abs(diff) > 1, "type_2",
              ifelse(two > 0 & one < 0 & abs(diff) > 1, "type_3",
              ifelse(two < 0 & one > 0 & abs(diff) > 1, "type_4", "other"))))]
# make plot
ggplot(dt, aes(x = one, y = two, color = diff_cat)) +
  geom_point()
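If I plot only the subset of points that I want to label, ggrepel is able to place all labels so that they do not overlap the other points or each other.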
ggplot(dt[abs(diff) > 2 & (!diff_cat %in% c("type_3", "type_4", "other"))],
       aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  geom_text_repel(data = dt[abs(diff) > 2 & (!diff_cat %in% c("type_3", "type_4", "other"))],
                  aes(x = one, y = two, label = name))
However, when I plot this subset together with the original data, the labels overlap the points:
# now add labels to a subset of points on the plot
ggplot(dt, aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  geom_text_repel(data = dt[abs(diff) > 2 & (!diff_cat %in% c("type_3", "type_4", "other"))],
                  aes(x = one, y = two, label = name))
How can I keep the labels for the subset of points from overlapping the points in the original data?
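You can try the following:
- Assign a blank label ("") to all the other points in the original data, so that geom_text_repel takes them into account when repelling labels;
- Increase the box.padding parameter from its default of 0.25 to some larger value, to put more distance between labels;
- Increase the x-axis and y-axis limits, to give the labels more space on all four sides to be repelled into.
Sample code (with box.padding = 1):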
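library(dplyr)  # provides mutate() and %>% used in the data argument below
ggplot(dt,
       aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  # `. %>% mutate(...)` is applied to the plot data, so every point gets a label
  # (empty for the points that should stay unlabeled)
  geom_text_repel(data = . %>%
                    mutate(label = ifelse(diff_cat %in% c("type_1", "type_2") & abs(diff) > 2,
                                          name, "")),
                  aes(label = label),
                  box.padding = 1,
                  show.legend = FALSE) + # this removes the 'a' from the legend
  coord_cartesian(xlim = c(-5, 5), ylim = c(-5, 5)) +
  theme_bw()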
Here is another attempt, with box.padding = 2:
(Note: I'm using ggrepel 0.8.0. I'm not sure whether all of this functionality is available in earlier package versions.)
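Since the question's data lives in a data.table, the blank-label idea can also be sketched without dplyr by adding the label column to the table itself. This is a minimal sketch, not the answerer's code: the column name `label` and the use of `max.overlaps = Inf` are my assumptions. `max.overlaps` is a geom_text_repel argument in ggrepel 0.9.0 and later (it is not available in the 0.8.0 release mentioned above); by default it drops labels that overlap too many things, which can happen on a plot this dense.

```r
library(data.table)
library(stringi)
library(ggplot2)
library(ggrepel)

# Same simulated data as in the question
set.seed(20180918)
dt <- data.table(
  name = stri_rand_strings(3000, length = 6),
  one  = rnorm(n = 3000, mean = 0, sd = 1),
  two  = rnorm(n = 3000, mean = 0, sd = 1))
dt[, diff := one - two]
dt[, diff_cat := ifelse(one > 0 & two > 0 & abs(diff) > 1, "type_1",
              ifelse(one < 0 & two < 0 & abs(diff) > 1, "type_2",
              ifelse(two > 0 & one < 0 & abs(diff) > 1, "type_3",
              ifelse(two < 0 & one > 0 & abs(diff) > 1, "type_4", "other"))))]

# Hypothetical helper column: real names for the points to label,
# empty strings for everything else so ggrepel still sees those points
dt[, label := ifelse(diff_cat %in% c("type_1", "type_2") & abs(diff) > 2,
                     name, "")]

p <- ggplot(dt, aes(x = one, y = two, color = diff_cat)) +
  geom_point() +
  geom_text_repel(aes(label = label),
                  box.padding = 1,
                  max.overlaps = Inf,   # assumption: requires ggrepel >= 0.9.0
                  show.legend = FALSE) +
  coord_cartesian(xlim = c(-5, 5), ylim = c(-5, 5)) +
  theme_bw()

print(p)
```

Because the label column is stored in `dt`, the text layer inherits the plot data and needs no separate `data` argument. The trade-off is that `dt` is modified in place by `:=`.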