Reshape dataframe so that matching family members have their own column
I have a data frame...
library(tibble)

df <- tibble(
  id = 1:5,
  family = c("a", "a", "b", "b", "c"),
  twin = c(1, 2, 1, 2, 1),
  datacol1 = 11:15,
  datacol2 = 21:25
)
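For each pair of twins (members of the same family), I need a second 'datacol' column holding the other twin's data. This should only happen for matched twins, so row 5 (from family "c") should have NA in the duplicated columns.
Ideally, the data would end up looking like this...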
df <- tibble(
  id = 1:5,
  family = c("a", "a", "b", "b", "c"),
  twin = c(1, 2, 1, 2, 1),
  datacol1 = 11:15,
  datacol1.b = c(12, 11, 14, 13, NA),
  datacol2 = 21:25,
  datacol2.b = c(22, 21, 24, 23, NA)
)
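I've added an image to help illustrate what I'm trying to achieve.
I'd like to be able to do this for all columns or for a selection of columns, preferably using the tidyverse.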
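library(dplyr)

cols <- c("datacol1", "datacol2")
df %>%
  group_by(family) %>%
  # Within each family, swap the two twins' values; families without
  # exactly two members get NA
  mutate_at(vars(all_of(cols)), function(x) {
    if (n() == 2) {
      rev(x)
    } else {
      NA
    }
  }) %>%
  ungroup() %>%
  # Keep only the swapped columns, add a ".b" suffix, and bind them
  # onto the original data
  select(all_of(cols)) %>%
  rename_all(~ paste0(., ".b")) %>%
  cbind(df, .)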
Base R
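cols <- c("datacol1", "datacol2")
# Split by family, add the swapped ".b" columns to each piece, then
# stack the pieces back together
do.call(rbind, lapply(split(df, df$family), function(x) {
  cbind(x, setNames(lapply(x[cols], function(y) {
    if (length(y) == 2) {
      rev(y)
    } else {
      NA
    }
  }),
  paste0(cols, ".b")))
}))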
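A variation on the base R approach, offered as a sketch rather than as part of the original answer: `ave()` applies a function within groups and returns a vector in the original row order, so the split/rbind step is not needed. The helper name `swap_pair` is made up for this illustration, and a plain `data.frame` is used so that no packages are required.

```r
# Same example data as in the question, built as a base data.frame
df <- data.frame(
  id = 1:5,
  family = c("a", "a", "b", "b", "c"),
  twin = c(1, 2, 1, 2, 1),
  datacol1 = 11:15,
  datacol2 = 21:25
)

cols <- c("datacol1", "datacol2")

# Hypothetical helper: swap the values of a complete pair, otherwise
# return NA for every member of the group
swap_pair <- function(y) {
  if (length(y) == 2) rev(y) else rep(NA, length(y))
}

# ave() runs swap_pair within each family and keeps the row order
df[paste0(cols, ".b")] <- lapply(df[cols], function(col) {
  ave(col, df$family, FUN = swap_pair)
})

df
```

Unlike the `split()` version, this leaves the row names untouched and does not reorder rows when families are not stored contiguously.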
We can also use mutate_at
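library(dplyr)

df %>%
  group_by(family) %>%
  # Naming the function `2` makes mutate_at add new "_2" columns
  # instead of overwriting the originals
  mutate_at(vars(starts_with('datacol')),
            list(`2` = ~ if (n() == 1) NA_integer_ else rev(.)))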
# A tibble: 5 x 7
# Groups:   family [3]
#      id family  twin datacol1 datacol2 datacol1_2 datacol2_2
#   <int> <chr>  <dbl>    <int>    <int>      <int>      <int>
#1      1 a          1       11       21         12         22
#2      2 a          2       12       22         11         21
#3      3 b          1       13       23         14         24
#4      4 b          2       14       24         13         23
#5      5 c          1       15       25         NA         NA
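For readers on a newer dplyr, the same idea can probably be written with `across()`, which supersedes `mutate_at()`. The following is a minimal sketch, assuming dplyr 1.0.0 or later; the `.names` template is what produces the `.b` suffix asked for in the question, and `result` is just an illustrative name.

```r
library(dplyr)
library(tibble)

df <- tibble(
  id = 1:5,
  family = c("a", "a", "b", "b", "c"),
  twin = c(1, 2, 1, 2, 1),
  datacol1 = 11:15,
  datacol2 = 21:25
)

cols <- c("datacol1", "datacol2")

result <- df %>%
  group_by(family) %>%
  # For complete pairs, reverse the two values so each twin sees the
  # other's data; any other group size gets NA
  mutate(across(all_of(cols),
                ~ if (n() == 2) rev(.x) else NA,
                .names = "{.col}.b")) %>%
  ungroup()

result
```

The new columns are appended at the end of the data frame. If the interleaved order from the question matters, `relocate()` or `select()` could be used afterwards to reorder them.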