How to combine rows with the same identifier in R?
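I've done a lot of searching but can't seem to find the answer I'm looking for. The rows were originally melted, then I spread them, and now I have a data frame that looks similar to this:
Here is the dput() output: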
structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
`first name` = c("Jamie", NA, NA, NA, NA, "sandra", NA, NA,
NA, NA), `last name` = c(NA, "Johns", NA, NA, NA, NA, NA,
"chan", NA, NA), q1_ans = c(NA, NA, "yes", NA, NA, NA, "yes",
NA, NA, NA), q2_ans = c(NA, NA, NA, "no", NA, NA, NA, NA,
"yes", NA), q3_ans = c(NA, NA, NA, NA, "yes", NA, NA, NA,
NA, "no")), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"), spec = structure(list(cols = list(ID = structure(list(), class = c("collector_integer",
"collector")), `first name` = structure(list(), class = c("collector_character",
"collector")), `last name` = structure(list(), class = c("collector_character",
"collector")), q1_ans = structure(list(), class = c("collector_character",
"collector")), q2_ans = structure(list(), class = c("collector_character",
"collector")), q3_ans = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector"))), class = "col_spec"))
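My real data frame has many more rows and columns. I'd like to combine them so that everything for ID 1 is in one row, everything for ID 2 is in one row, and so on. I tried the following, but it didn't get me anywhere: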
qr <- qr %>%
group_by(., ID) %>%
rowwise() %>%
summarise_all(funs(first(na.omit(.))))
I get this error:
Error in summarise_impl(.data, dots) :
Column `first name` must be length 1 (a summary value), not 0
I also tried dcast, but that didn't work either. Thanks!
We don't need rowwise. After grouping by 'ID', use na.omit inside summarise_all (assuming there is only one non-NA element per column for each 'ID'):
qr %>%
group_by(ID) %>%
summarise_all(na.omit)
# A tibble: 2 x 6
# ID `first name` `last name` q1_ans q2_ans q3_ans
# <int> <chr> <chr> <chr> <chr> <chr>
#1 1 Jamie Johns yes no yes
#2 2 sandra chan yes yes no
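In dplyr 1.0.0 and later the scoped verbs such as summarise_all() are superseded, so the same collapse can also be sketched with across(). This is a minimal sketch rather than the answer's original code: qr_small is a hypothetical miniature of the question's data, and the non-NA value is picked by plain indexing (`[1]`), which yields NA instead of an error when a group has no non-NA value in a column.

```r
library(dplyr)

# Hypothetical miniature of the question's data (qr_small is an assumed name)
qr_small <- tibble(
  ID           = c(1L, 1L, 1L, 2L, 2L, 2L),
  `first name` = c("Jamie", NA, NA, "sandra", NA, NA),
  `last name`  = c(NA, "Johns", NA, NA, "chan", NA),
  q1_ans       = c(NA, NA, "yes", NA, NA, "yes")
)

# Collapse each ID to one row, keeping the first non-NA value per column.
# across(everything()) skips the grouping column automatically.
collapsed <- qr_small %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]), .groups = "drop")

print(collapsed)
```

Taking only the first non-NA value silently discards any further values, so this form fits the "one non-NA element per column per ID" assumption stated above.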
If a column has multiple non-NA elements for an 'ID', create a single string by concatenating all the non-NA elements:
qr %>%
group_by(ID) %>%
summarise_all(funs(toString(na.omit(.))))
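Since funs() is deprecated in newer dplyr releases, the concatenation can also be sketched with across() and a lambda. The data frame multi below is hypothetical, made up so that one ID has two non-NA answers.

```r
library(dplyr)

# Hypothetical data: ID 1 has two non-NA values in q1_ans
multi <- tibble(
  ID     = c(1L, 1L, 1L, 2L, 2L),
  q1_ans = c("yes", NA, "no", NA, "yes")
)

# Drop the NAs in each group, then join what is left with ", "
joined <- multi %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ toString(.x[!is.na(.x)])), .groups = "drop")

print(joined)
```

Note that toString() on an empty vector returns an empty string, so a group with no non-NA value ends up as "" rather than NA.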
Or create a list column, which can then be unnested:
qr %>%
group_by(ID) %>%
summarise_all(funs(list(na.omit(.))))
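The unnest step is not shown above, so here is a minimal sketch of it, assuming tidyr 1.0.0 or later and the same kind of hypothetical data frame (multi) in which one ID has two non-NA answers. Only a single list column is unnested here, because unnesting several list columns in one call requires their lengths to be compatible within each row.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: ID 1 has two non-NA values in q1_ans
multi <- tibble(
  ID     = c(1L, 1L, 1L, 2L, 2L),
  q1_ans = c("yes", NA, "no", NA, "yes")
)

# One row per ID, with all non-NA answers stored in a list column
nested <- multi %>%
  group_by(ID) %>%
  summarise(q1_ans = list(q1_ans[!is.na(q1_ans)]), .groups = "drop")

# Expand the list column back to one row per stored value
long <- nested %>%
  unnest(cols = q1_ans)

print(long)
```

The list form keeps every value intact, which is useful when the values should stay separate rather than be pasted into one string.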