How to duplicate responses in rows with specific matching variables in R
My dataset looks like this:
ID   block  question  latency  response          rating
425  1      1         3452     my response       3
425  1      2         6427                       1
425  1      3         4630                       5
425  1      4         5319                       2
425  1      5         2501                       2
425  2      1         4205     another response  4
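Not all participants completed the same number of blocks/questions.
I only want to copy the response into the empty 'response' cells where both the ID and the block number match those of an existing response, like this: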
ID   block  question  latency  response          rating
425  1      1         3452     my response       3
425  1      2         6427     my response       1
425  1      3         4630     my response       5
425  1      4         5319     my response       2
425  1      5         2501     my response       2
425  2      1         4205     another response  4
You can try something like this. It first replaces the empty strings with NAs, then fills in the values within each ID/block group using fill.
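library(dplyr)
library(tidyr)

df %>%
  group_by(ID, block) %>%                                      # fill only within each ID/block pair
  mutate(response = ifelse(response == "", NA, response)) %>%  # empty strings become NA
  fill(response, .direction = "down") %>%                      # carry the last non-NA response downward
  ungroup()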
# A tibble: 6 x 6
     ID block question latency response         rating
  <dbl> <dbl>    <dbl>   <dbl> <chr>             <int>
1   425     1        1    3452 my response           3
2   425     1        2    6427 my response           1
3   425     1        3    4630 my response           5
4   425     1        4    5319 my response           2
5   425     1        5    2501 my response           2
6   425     2        1    4205 another response      4
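The approach above fills downward only, which is enough when the typed response sits on the first row of each block, as in the example data. If the response could instead appear on a later row of a block, one possible variation is to fill in both directions. The sketch below is only an illustration under that assumption: `df2` is a made-up data frame (not from the question), and `na_if()` is swapped in for the `ifelse()` call as a shorter way to turn empty strings into NA.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: in block 1 the response sits on the second row
df2 <- data.frame(
  ID = c(425, 425, 425, 425),
  block = c(1, 1, 1, 2),
  question = c(1, 2, 3, 1),
  response = c("", "my response", "", "another response"),
  stringsAsFactors = FALSE
)

filled <- df2 %>%
  group_by(ID, block) %>%
  mutate(response = na_if(response, "")) %>%   # empty strings become NA
  fill(response, .direction = "downup") %>%    # fill down first, then up
  ungroup()

print(filled)
```

With `.direction = "downup"`, rows before the first non-empty response in a group are also filled. Whether that is desirable depends on how the responses were recorded.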
Data
df <- structure(list(ID = c(425, 425, 425, 425, 425, 425), block = c(1,
1, 1, 1, 1, 2), question = c(1, 2, 3, 4, 5, 1), latency = c(3452,
6427, 4630, 5319, 2501, 4205), response = c("my response", "",
"", "", "", "another response"), rating = c(3L, 1L, 5L, 2L, 2L,
4L)), row.names = c(NA, -6L), class = "data.frame")
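If installing dplyr and tidyr is not an option, roughly the same grouped fill can be sketched in base R with `ave()`. This is an alternative technique rather than the method shown in the answer; `df_base` and the helper `fill_down()` are hypothetical names introduced here for illustration.

```r
# Same values as the example data, under a separate (hypothetical) name
df_base <- data.frame(
  ID = c(425, 425, 425, 425, 425, 425),
  block = c(1, 1, 1, 1, 1, 2),
  question = c(1, 2, 3, 4, 5, 1),
  latency = c(3452, 6427, 4630, 5319, 2501, 4205),
  response = c("my response", "", "", "", "", "another response"),
  rating = c(3L, 1L, 5L, 2L, 2L, 4L),
  stringsAsFactors = FALSE
)

# Treat empty strings as missing
df_base$response[df_base$response == ""] <- NA

# Hypothetical helper: carry the last non-NA value forward within a vector.
# Leading NAs (before any non-NA value) stay NA.
fill_down <- function(x) {
  idx <- cumsum(!is.na(x))        # index of the most recent non-NA value
  c(NA, x[!is.na(x)])[idx + 1]    # position 1 holds NA for idx == 0
}

# Apply the helper separately within each ID/block combination
df_base$response <- ave(df_base$response, df_base$ID, df_base$block,
                        FUN = fill_down)

print(df_base)
```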