R,ggplot,使用抖动时如何将相关点保持在一起?
R, ggplot, How do I keep related points together when using jitter?
我的数据框中的一个变量是表示获得或花费金额的因素。每个事件都有一个“增益”值;可能有也可能没有相应的“支出”金额。这是一张带有过度绘制的观察结果的图像:
添加一些随机抖动有助于视觉效果,但是,“支出”金额与其相应的增益事件脱节:
我希望看到蓝色圆圈在它们的增益圆圈(其中“id”相等)中“瞄准”,并且成对抖动。以下是一些示例数据(三天)和代码:
library(ggplot2)
ccode<-c(Gain="darkseagreen",Spend="darkblue")
ef<-data.frame(
date=as.Date(c("2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03")),
site=c("Castle","Temple","Temple","Temple","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Castle","Castle","Temple","Temple","Palace","Castle","Temple","Temple","Temple","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Castle","Castle","Temple","Temple","Palace"),
id=c("C123","T101","T93","T94","T95","T96","P102","P96","C126","C127","C128","T100","T98","P100","P98","C129","C130","C131","C132","C133","T104","T99","P99","C123","T101","T93","T94","T95","T96","P102","P96","C126","C127","C128","T100","T98","P100","P98","C129","C130","C131","C132","C133","T104","T99","P99"),
gainspend=c("Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend"),
amount=c(6,14,34,31,3,10,6,14,2,16,16,14,1,1,15,11,8,7,2,10,15,4,3,NA,NA,4,5,NA,NA,NA,NA,NA,NA,2,NA,1,NA,3,NA,NA,2,NA,NA,2,NA,3))
#▼ 3 day, points centered
ggplot(ef,aes(date,site)) +
geom_point(aes(size=amount,color=gainspend),alpha=0.5) +
scale_color_manual(values=ccode) +
scale_size_continuous(range=c(1,15),breaks=c(5,10,20))
#▼ 3 day, jitted
ggplot(ef,aes(date,site)) +
geom_point(aes(size=amount,color=gainspend),alpha=0.5,position=position_jitter(w=0,h=0.2)) +
scale_color_manual(values=ccode) +
scale_size_continuous(range=c(1,15),breaks=c(5,10,20))
我的主要想法是旧的“手动添加抖动”方法。我想知道是否有更好的方法可以像绘制小饼图作为点 la package scatterpie.
在这种情况下,您可以为每个 ID 的抖动量添加一个随机数,这样组内的点将移动相同的量。这需要在 ggplot2.
之外进行工作
首先画出为每个ID添加的“抖动”。由于分类轴是 1 个单位宽,我选择 -.3 和 .3 之间的数字。我使用 dplyr 来完成这项工作并设置种子,这样你会得到相同的结果。
library(dplyr)
set.seed(16)
ef2 = ef %>%
group_by(id) %>%
mutate(jitter = runif(1, min = -.3, max = .3)) %>%
ungroup()
接下来是剧情。我使用 geom_blank()
图层,以便在添加抖动之前绘制分类 site
轴。我将 site
从一个因子转换为数字并添加抖动;这仅适用于因子,幸运的是 ggplot2 中的分类轴基于因子。
现在配对的 ID 一起移动。
ggplot(ef2, aes(x = date, y = site)) +
geom_blank() +
geom_point(aes(size = amount, color = gainspend,
y = as.numeric(factor(site)) + jitter),
alpha=0.5) +
scale_color_manual(values = ccode) +
scale_size_continuous(range = c(1, 15), breaks = c(5, 10, 20))
#> Warning: Removed 15 rows containing missing values (geom_point).
由 reprex package (v2.0.0)
于 2021-09-23 创建
您可以通过 id
在 ggplot() 调用之外添加一些抖动。
jj <- data.frame(id = unique(ef$id), jtr = runif(nrow(ef), -0.3, 0.3))
ef <- merge(ef, jj, by = 'id')
ef$sitej <- as.numeric(factor(ef$site)) + ef$jtr
但是您需要 site
integer/numeric 才能执行此操作。所以在做图的时候,需要手动添加坐标轴标签scale_y_continuous()
。 (更新:上面 aosmith 的 geom_blank()
技巧是更好的解决方案!)
ggplot(ef,aes(date,sitej)) +
geom_point(aes(size=amount,color=gainspend),alpha=0.5) +
scale_color_manual(values=ccode) +
scale_size_continuous(range=c(1,15),breaks=c(5,10,20)) +
scale_y_continuous(breaks = 1:3, labels= sort(unique(ef$site)))
这似乎可行,但仍然有几个 gain/spend 圈子没有合作伙伴 -- 也许 id
变量有问题。
也许其他人有更好的方法!
我的数据框中的一个变量是表示获得或花费金额的因素。每个事件都有一个“增益”值;可能有也可能没有相应的“支出”金额。这是一张带有过度绘制的观察结果的图像:
添加一些随机抖动有助于视觉效果,但是,“支出”金额与其相应的增益事件脱节:
我希望看到蓝色圆圈在它们的增益圆圈(其中“id”相等)中“瞄准”,并且成对抖动。以下是一些示例数据(三天)和代码:
library(ggplot2)
ccode<-c(Gain="darkseagreen",Spend="darkblue")
ef<-data.frame(
date=as.Date(c("2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-01","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-02","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03","2021-03-03")),
site=c("Castle","Temple","Temple","Temple","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Castle","Castle","Temple","Temple","Palace","Castle","Temple","Temple","Temple","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Temple","Temple","Palace","Palace","Castle","Castle","Castle","Castle","Castle","Temple","Temple","Palace"),
id=c("C123","T101","T93","T94","T95","T96","P102","P96","C126","C127","C128","T100","T98","P100","P98","C129","C130","C131","C132","C133","T104","T99","P99","C123","T101","T93","T94","T95","T96","P102","P96","C126","C127","C128","T100","T98","P100","P98","C129","C130","C131","C132","C133","T104","T99","P99"),
gainspend=c("Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Gain","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend","Spend"),
amount=c(6,14,34,31,3,10,6,14,2,16,16,14,1,1,15,11,8,7,2,10,15,4,3,NA,NA,4,5,NA,NA,NA,NA,NA,NA,2,NA,1,NA,3,NA,NA,2,NA,NA,2,NA,3))
#▼ 3 day, points centered
ggplot(ef,aes(date,site)) +
geom_point(aes(size=amount,color=gainspend),alpha=0.5) +
scale_color_manual(values=ccode) +
scale_size_continuous(range=c(1,15),breaks=c(5,10,20))
#▼ 3 day, jitted
ggplot(ef,aes(date,site)) +
geom_point(aes(size=amount,color=gainspend),alpha=0.5,position=position_jitter(w=0,h=0.2)) +
scale_color_manual(values=ccode) +
scale_size_continuous(range=c(1,15),breaks=c(5,10,20))
我的主要想法是旧的“手动添加抖动”方法。我想知道是否有更好的方法可以像绘制小饼图作为点 la package scatterpie.
在这种情况下,您可以为每个 ID 的抖动量添加一个随机数,这样组内的点将移动相同的量。这需要在 ggplot2.
之外进行工作首先画出为每个ID添加的“抖动”。由于分类轴是 1 个单位宽,我选择 -.3 和 .3 之间的数字。我使用 dplyr 来完成这项工作并设置种子,这样你会得到相同的结果。
library(dplyr)
set.seed(16)
ef2 = ef %>%
group_by(id) %>%
mutate(jitter = runif(1, min = -.3, max = .3)) %>%
ungroup()
接下来是剧情。我使用 geom_blank()
图层,以便在添加抖动之前绘制分类 site
轴。我将 site
从一个因子转换为数字并添加抖动;这仅适用于因子,幸运的是 ggplot2 中的分类轴基于因子。
现在配对的 ID 一起移动。
ggplot(ef2, aes(x = date, y = site)) +
geom_blank() +
geom_point(aes(size = amount, color = gainspend,
y = as.numeric(factor(site)) + jitter),
alpha=0.5) +
scale_color_manual(values = ccode) +
scale_size_continuous(range = c(1, 15), breaks = c(5, 10, 20))
#> Warning: Removed 15 rows containing missing values (geom_point).
由 reprex package (v2.0.0)
于 2021-09-23 创建您可以通过 id
在 ggplot() 调用之外添加一些抖动。
jj <- data.frame(id = unique(ef$id), jtr = runif(nrow(ef), -0.3, 0.3))
ef <- merge(ef, jj, by = 'id')
ef$sitej <- as.numeric(factor(ef$site)) + ef$jtr
但是您需要 site
integer/numeric 才能执行此操作。所以在做图的时候,需要手动添加坐标轴标签scale_y_continuous()
。 (更新:上面 aosmith 的 geom_blank()
技巧是更好的解决方案!)
ggplot(ef,aes(date,sitej)) +
geom_point(aes(size=amount,color=gainspend),alpha=0.5) +
scale_color_manual(values=ccode) +
scale_size_continuous(range=c(1,15),breaks=c(5,10,20)) +
scale_y_continuous(breaks = 1:3, labels= sort(unique(ef$site)))
这似乎可行,但仍然有几个 gain/spend 圈子没有合作伙伴 -- 也许 id
变量有问题。
也许其他人有更好的方法!