How to replace values in multiple columns based on value from left-adjacent column
I have data like this (although the real dataset is much larger):
correct.trial1 RT.trial1 correct.trial2 RT.trial2 correct.trial3 RT.trial3
1 1 473 0 337 1 426
2 1 496 1 407 1 421
3 1 368 0 405 1 470
4 0 333 1 475 0 473
5 0 435 0 312 1 402
The sample can be generated with:
set.seed(12)
df <- data.frame(correct.trial1 = sample(0:1, 5, replace=T),
RT.trial1 = sample(300:500, 5, replace=T),
correct.trial2 = sample(0:1, 5, replace=T),
RT.trial2 = sample(300:500, 5, replace=T),
correct.trial3 = sample(0:1, 5, replace=T),
RT.trial3 = sample(300:500, 5, replace=T))
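Whenever the left-adjacent column (starts_with("correct.trial")) has the value 0, I want to replace the value in the corresponding starts_with("RT.trial") column with NA. Of course, I could do this one column at a time, e.g.: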
library(dplyr)
df %>%
mutate(RT.trial1 = ifelse(correct.trial1==1, RT.trial1, NA),
RT.trial2 = ifelse(correct.trial2==1, RT.trial2, NA),
RT.trial3 = ifelse(correct.trial3==1, RT.trial3, NA))
so that the result looks like this:
correct.trial1 RT.trial1 correct.trial2 RT.trial2 correct.trial3 RT.trial3
1 1 473 0 NA 1 426
2 1 496 1 407 1 421
3 1 368 0 NA 1 470
4 0 NA 1 475 0 NA
5 0 NA 0 NA 1 402
But that is impractical for thousands of columns.
Question
How can I do this for all columns at once? (Note: I would prefer a dplyr solution, and using across is preferable to mutate_at.)
Attempt
I'm not sure, but based on this, it would (probably) look something like:
df %>%
mutate_at(vars(starts_with("RT.trial")),
~ifelse(vars(starts_with("correct.trial"))==0, NA, .x))
We can reshape to 'long' format and then do the transformation:
library(dplyr)
library(tidyr)
df %>%
mutate(rn = row_number()) %>%
pivot_longer(cols = -rn, names_to = c(".value", "grp"),
names_sep = "\\.") %>%
mutate(RT = case_when(as.logical(correct) ~ RT)) %>%
pivot_wider(names_from = grp, values_from = c(correct, RT),
names_sep = ".") %>%
select(names(df))
-output
# A tibble: 5 x 6
# correct.trial1 RT.trial1 correct.trial2 RT.trial2 correct.trial3 RT.trial3
# <int> <int> <int> <int> <int> <int>
#1 0 NA 0 NA 0 NA
#2 1 394 1 458 0 NA
#3 0 NA 1 337 0 NA
#4 1 479 0 NA 0 NA
#5 0 NA 0 NA 0 NA
In base R, this can be done more simply:
i1 <- grepl('correct', names(df))
df[!i1] <- (NA^!df[i1]) * df[!i1]
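The base R line above relies on the fact that, in R, `NA^0` evaluates to 1 while `NA^1` stays NA. A minimal sketch on two small vectors (the names `correct`, `rt`, `multiplier` and `result` are made up for illustration) may make the trick easier to follow:

```r
# Illustrative vectors, not taken from the data above
correct <- c(1L, 0L, 1L)
rt      <- c(473L, 337L, 426L)

# !correct is c(FALSE, TRUE, FALSE); NA^FALSE is 1 and NA^TRUE is NA
multiplier <- NA^!correct

# Multiplying keeps rt where correct == 1 and yields NA where correct == 0
result <- multiplier * rt
print(result)
```

Note that the multiplication returns doubles, so the RT columns would no longer be integer after this step.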
data
df <- structure(list(correct.trial1 = c(0L, 1L, 0L, 1L, 0L), RT.trial1 = c(417L,
394L, 345L, 479L, 368L), correct.trial2 = c(0L, 1L, 1L, 0L, 0L
), RT.trial2 = c(382L, 458L, 337L, 406L, 306L), correct.trial3 = c(0L,
0L, 0L, 0L, 0L), RT.trial3 = c(469L, 364L, 361L, 359L, 309L)),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5"))
If you want to stay within the tidyverse, here is another option:
library(dplyr)
purrr::map2_dfc(df %>% select(starts_with('RT')),
df %>% select(starts_with('correct')),
~if_else(.y == 0, NA_integer_, .x)) %>%
bind_cols(df %>% select(starts_with('correct'))) %>%
#To get correct order of columns
select(order(as.numeric(sub('\\D+', '', names(.)))))
# RT.trial1 correct.trial1 RT.trial2 correct.trial2 RT.trial3 correct.trial3
# <int> <int> <int> <int> <int> <int>
#1 473 1 NA 0 426 1
#2 496 1 407 1 421 1
#3 368 1 NA 0 470 1
#4 NA 0 475 1 NA 0
#5 NA 0 NA 0 402 1
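The map2_dfc call pairs the RT and correct columns purely by position, so it assumes both selections come back in the same trial order. As a hedged alternative, here is a base R sketch that derives each partner column by name instead; `df_demo`, `rt_cols` and `correct_cols` are hypothetical names used only for this illustration:

```r
# Small illustrative data frame, not the data from the question
df_demo <- data.frame(correct.trial1 = c(1L, 0L), RT.trial1 = c(473L, 333L),
                      correct.trial2 = c(0L, 1L), RT.trial2 = c(337L, 475L))

# Build the partner name for every RT column by swapping the prefix
rt_cols      <- grep("^RT\\.", names(df_demo), value = TRUE)
correct_cols <- sub("^RT", "correct", rt_cols)

# Fail early if an RT column has no matching correct column
stopifnot(all(correct_cols %in% names(df_demo)))

# Pairwise replacement: NA where the matching correct column is 0
df_demo[rt_cols] <- Map(function(rt, ok) ifelse(ok == 0, NA_integer_, rt),
                        df_demo[rt_cols], df_demo[correct_cols])
print(df_demo)
```

This version also leaves the column order of the data frame unchanged, so no reordering step is needed afterwards.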
This works too. It is the simplest way to do it with across.
library(tidyverse)
df %>%
mutate(across(starts_with("RT.trial"), ~ if_else(get(str_c("correct.trial", str_sub(cur_column(), -1))) == 0, NA_integer_, .)))
This gives:
correct.trial1 RT.trial1 correct.trial2 RT.trial2 correct.trial3 RT.trial3
1 1 473 0 NA 1 426
2 1 496 1 407 1 421
3 1 368 0 NA 1 470
4 0 NA 1 475 0 NA
5 0 NA 0 NA 1 402
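One caveat with the across call above: str_sub(cur_column(), -1) keeps only the last character of the column name, so it would presumably look up the wrong partner once trial numbers reach two digits (RT.trial12 would be matched to correct.trial2). A possible variant, sketched here on a hypothetical data frame `df2`, builds the partner name by swapping the prefix instead:

```r
library(dplyr)

# Hypothetical data with a two-digit trial number (not from the question)
df2 <- data.frame(correct.trial1  = c(1L, 0L), RT.trial1  = c(473L, 333L),
                  correct.trial12 = c(0L, 1L), RT.trial12 = c(337L, 475L))

out <- df2 %>%
  mutate(across(starts_with("RT.trial"),
                # "RT.trial12" becomes "correct.trial12", whatever the number of digits
                ~ if_else(get(sub("^RT", "correct", cur_column())) == 0,
                          NA_integer_, .x)))
print(out)
```

The rest of the call is unchanged from the answer above; only the way the matching column name is derived differs.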