Transform columns to rows with a condition in R
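I am trying to merge several grade columns into one, keeping each distinct value; when an id has more than one distinct value, the extra value should be placed in another row. Here is what my dataset looks like.
df <- data.frame(
  id = c(1,2,3),
  role = c("A","B","C"),
  grade.1 = c(3,4,5),
  state.1 = c(1,NA,1),
  grade.2 = c(4,4,5),
  state.2 = c(1,1,NA),
  grade.3 = c(3,4,5),
  state.3 = c(1,1,NA))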
> df
  id role grade.1 state.1 grade.2 state.2 grade.3 state.3
1  1    A       3       1       4       1       3       1
2  2    B       4      NA       4       1       4       1
3  3    C       5       1       5      NA       5      NA
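I need to combine the grade.1, grade.2, and grade.3 columns into a single Grade column. I tried coalesce, but it loses information for id = 1, because that id has two different grades across the grade.* columns. The state.* mapping does not work either.
library(dplyr)
df <- df %>%
  mutate(Grade = coalesce(grade.1, grade.2, grade.3))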
> df
  id role grade.1 state.1 grade.2 state.2 grade.3 state.3 Grade
1  1    A       3       1       4       1       3       1     3
2  2    B       4      NA       4       1       4       1     4
3  3    C       5       1       5      NA       5      NA     5
What I want is an additional row for id = 1 that holds its second grade. My expected dataset is:
> df.2
  id role Grade state.1 state.2 state.3
1  1    A     3       1      NA       1
2  1    A     4      NA       1      NA
3  2    B     4      NA       1       1
4  3    C     5       1      NA      NA
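So when an id has more than one grade, each grade needs to go in its own row, and the state.* values should be mapped according to that grade.
Any ideas?
Thanks!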
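One approach:
- Get the data into long format.
- For each id, set every value in each state column to NA except the one in the row whose position matches that column's number.
- For each unique value in the grade column, take the first non-NA value of each state column.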
library(dplyr)
library(tidyr)
df %>%
  # stack grade.1, grade.2, grade.3 into one 'grade' column and drop the old names
  pivot_longer(cols = starts_with('grade'),
               values_to = 'grade', names_to = NULL) %>%
  group_by(id) %>%
  # within each id, keep state.k only on the k-th row (the row that came from grade.k)
  mutate(across(starts_with('state'),
         ~replace(., -as.numeric(sub('state.', '', cur_column(), fixed = TRUE)), NA))) %>%
  group_by(id, role, grade) %>%
  # collapse to one row per id and grade, taking the first non-NA state (NA if none)
  summarise(across(starts_with('state'), ~.x[!is.na(.x)][1]), .groups = 'drop')
#     id role  grade state.1 state.2 state.3
#  <dbl> <chr> <dbl>   <dbl>   <dbl>   <dbl>
#1     1 A         3       1      NA       1
#2     1 A         4      NA       1      NA
#3     2 B         4      NA       1       1
#4     3 C         5       1      NA      NA
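The same idea (go long, then spread the state values back out by grade) can also be sketched without the positional replace() step. This is an illustrative variant, not the answerer's code: it assumes tidyr 1.0 or later for the ".value" sentinel in pivot_longer(), and the names df_demo, long_demo, wide_demo, and slot are made up for the example.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question (named df_demo so it does not clobber df)
df_demo <- data.frame(
  id = c(1, 2, 3),
  role = c("A", "B", "C"),
  grade.1 = c(3, 4, 5),
  state.1 = c(1, NA, 1),
  grade.2 = c(4, 4, 5),
  state.2 = c(1, 1, NA),
  grade.3 = c(3, 4, 5),
  state.3 = c(1, 1, NA)
)

# Split names like "grade.1" at the dot: the part before it becomes a column
# ("grade" or "state"), the part after it goes into 'slot' ("1", "2", "3")
long_demo <- df_demo %>%
  pivot_longer(cols = -c(id, role),
               names_to = c(".value", "slot"),
               names_sep = "\\.")

# One row per id/role/grade; each slot's state lands in its own state.<slot>
# column, and slots that did not carry that grade are left as NA
wide_demo <- long_demo %>%
  pivot_wider(id_cols = c(id, role, grade),
              names_from = slot,
              values_from = state,
              names_prefix = "state.") %>%
  arrange(id, grade)

wide_demo
```

On the example data this is meant to give the same four rows as the expected df.2 (with the column named grade rather than Grade). It relies on each id having exactly one grade/state pair per slot, which holds for the wide input shown.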
I'm not sure this is exactly what you are looking for, or whether it is the best way to do it, but this is what I have so far:
library(data.table)
# the df[, .(...)] syntax below needs a data.table (assumption: df is still a data.frame here)
setDT(df)
# split into 3 tables, one per grade/state pair
df1 <- df[, .(id, role, grade = grade.1, state.1)]
df2 <- df[, .(id, role, grade = grade.2, state.2)]
df3 <- df[, .(id, role, grade = grade.3, state.3)]
# set the keys to do joins
setkey(df1, id, role, grade)
setkey(df2, id, role, grade)
setkey(df3, id, role, grade)
# join the three tables in every order so that no id/grade combination is dropped
df_res <- rbind(
  df1[df2[df3]],
  df1[df3[df2]],
  df2[df3[df1]],
  df2[df1[df3]],
  df3[df1[df2]],
  df3[df2[df1]],
  fill = TRUE
)
unique(df_res)[order(id)]
   id role grade state.1 state.2 state.3
1:  1    A     3       1      NA       1
2:  1    A     4      NA       1      NA
3:  2    B     4      NA       1       1
4:  3    C     5       1      NA      NA
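Enumerating every join order grows quickly as more grade/state pairs are added, so here is a rough data.table sketch of the same reshape using melt() and dcast() instead. It is only an illustration under assumptions: dt_demo, long_dt, wide_dt, and slot are hypothetical names, and it relies on patterns() pairing the grade.* and state.* columns by their position in the table.

```r
library(data.table)

# Same example data as in the question, built directly as a data.table
dt_demo <- data.table(
  id = c(1, 2, 3),
  role = c("A", "B", "C"),
  grade.1 = c(3, 4, 5),
  state.1 = c(1, NA, 1),
  grade.2 = c(4, 4, 5),
  state.2 = c(1, 1, NA),
  grade.3 = c(3, 4, 5),
  state.3 = c(1, 1, NA)
)

# Melt both column families at once; 'slot' records which pair (1, 2 or 3)
# each long row came from
long_dt <- melt(dt_demo,
                id.vars = c("id", "role"),
                measure.vars = patterns("^grade", "^state"),
                value.name = c("grade", "state"),
                variable.name = "slot")

# Turn the slot index into the target column name, e.g. "state.2"
long_dt[, slot := paste0("state.", slot)]

# One row per id/role/grade; slots that did not carry that grade stay NA
wide_dt <- dcast(long_dt, id + role + grade ~ slot, value.var = "state")

wide_dt
```

Because each id contributes one row per slot, every id/grade/slot cell receives at most one value, so dcast() does not need an aggregation function here.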