Jitter points by different amounts based on condition
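I have a dataset with discrete X-axis values and a large number of Y values. I also have a separate vector containing a measure of uncertainty in the X-axis values; this uncertainty varies along the X-axis. I want to jitter my X-axis values by an amount proportional to this uncertainty measure. Doing this with a loop is simple but cumbersome; I'm looking for an efficient solution.
Reproducible example: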
# Create data frame with discrete X-axis values (a)
dat <- data.frame(a = c(rep(5, 5), rep(15, 5), rep(25, 5)),
                  b = c(runif(5, 1, 2), runif(5, 2, 3), runif(5, 3, 4)))
# Plot raw, unjittered data
plot(dat$b ~ dat$a, data = dat, col = as.factor(dat$a), pch = 20, cex = 2)
# Vector of uncertainty estimates, one per distinct value of a
wid_vec <- c(1, 10, 3)
# Ugly manual jittering, not feasible for large datasets but
# produces the desired result
dat$a_jit <- c(jitter(rep(5, 5), amount = 1),
               jitter(rep(15, 5), amount = 10),
               jitter(rep(25, 5), amount = 3))
plot(dat$b ~ dat$a_jit, col = as.factor(dat$a), pch = 20, cex = 2)
# Ugly loop solution, also works
newdat <- data.frame()
a_s <- unique(dat$a)
for (i in seq_along(a_s)) {
  subdat <- dat[dat$a == a_s[i], ]
  subdat$a_jit <- jitter(subdat$a, amount = wid_vec[i])
  newdat <- rbind(newdat, subdat)
}
plot(newdat$b ~ newdat$a_jit, col = as.factor(newdat$a), pch = 20, cex = 2)
# Trying to make a vectorized solution, but this of course does not work.
jitter_custom <- function(x, wid) {
  j <- x + runif(length(x), -wid, wid)
  j
}
# runif() does not work this way, this is shown to indicate the direction
# I've been attempting
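Basically, I need to split the data by condition, look up the relevant entry in the wid_vec vector, and then create a new column by modifying the data entries according to the wid_vec value. It sounds like there should be an elegant dplyr solution to this, but it is eluding me right now.
Thanks for any suggestions!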
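The jitter_custom() idea may be closer to working than it looks. runif() recycles its min and max arguments to the length of the result, so the missing piece is only a width vector with one entry per row instead of one per group. The sketch below assumes wid_vec is ordered like unique(dat$a); wid_per_row is a name made up for this illustration.

```r
set.seed(42)

# Example data in the same shape as the question's `dat`
dat <- data.frame(a = rep(c(5, 15, 25), each = 5),
                  b = c(runif(5, 1, 2), runif(5, 2, 3), runif(5, 3, 4)))

# One width per distinct value of `a`, in order of first appearance (assumption)
wid_vec <- c(1, 10, 3)

# match() gives each row's position in unique(a); use it to look up the width
wid_per_row <- wid_vec[match(dat$a, unique(dat$a))]

# runif() recycles min and max, so every draw uses its own row's bounds
dat$a_jit <- dat$a + runif(nrow(dat), -wid_per_row, wid_per_row)
```

Because the lookup goes through match(), the rows do not need to be sorted by a for this to pair each point with the right width.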
As an alternative to
set.seed(1)
dat$a_jit <- c(jitter(rep(5, 5), amount = 1),
jitter(rep(15, 5), amount = 10),
jitter(rep(25, 5), amount = 3))
you can use
set.seed(1)
x <- with(dat, jitter(a, amount=setNames(c(1,10,3), unique(a))[as.character(a)]))
The result is the same:
identical(x, dat$a_jit)
# [1] TRUE
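If you want the warning to go away, you can wrap the jitter(...) call in suppressWarnings(), or use something like with(dat, mapply(jitter, x=a, amount=setNames(c(1,10,3), unique(a))[as.character(a)])). The warning arises because jitter() tests amount == 0 inside if(), and here amount is a vector. As far as I can tell, in R 4.2.0 and later a condition of length greater than one is an error rather than a warning, so on those versions the mapply() form is the one to use.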
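Another way to keep jitter() itself while only ever passing it a scalar amount is to split the values by group, jitter each piece, and put the pieces back in row order. This is a minimal base R sketch rather than the approach used above; grp and pieces are hypothetical names, and wid_vec is again assumed to follow the order of unique(dat$a).

```r
set.seed(1)

# Discrete X values as in the question
dat <- data.frame(a = rep(c(5, 15, 25), each = 5))
wid_vec <- c(1, 10, 3)

# Grouping factor whose levels follow first appearance, matching wid_vec's order
grp <- factor(dat$a, levels = unique(dat$a))

# Jitter each group with its own scalar amount
pieces <- Map(jitter, split(dat$a, grp), amount = wid_vec)

# Reassemble the jittered values in the original row order
dat$a_jit <- unsplit(pieces, grp)
```

Setting the factor levels explicitly matters because split() otherwise orders groups by sorted value, which would only coincide with wid_vec when the distinct values first appear in ascending order.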