In R: combining rows that are almost duplicates into one row, with the differing elements combined

Question

I have a data.frame of the following form:

my_df <- data.frame(id = c(1, 1, 2, 3), 
                 title = c("YourMa", "YourMa", "MyMa", "HisMa"), 
                autqty = c(2, 2, 1, 1), 
                   aut = c("Steve", "Joe", "Albert", "Kevin"), 
                  pubb = c("Good", "Good", "Meh", "Fan"))

which looks like this:

> my_df
id  title   autqty aut    pubb
1   YourMa     2   Steve  Good
1   YourMa     2   Joe    Good
2   MyMa       1   Albert Meh
3   HisMa      1   Kevin  Fan

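Notice that for id 1, all the information is identical except for the aut entry. My goal is to condense my_df so that the aut values are combined into a single element: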

  id  title  autqty  aut         pubb
   1 YourMa    2     Steve, Joe  Good
   2 MyMa      1     Albert      Meh
   3 HisMa     1     Kevin       Fan

Note: this is a scaled-down version of my original data. I would like to be able to handle any number of aut values per id.

Answer 1

Use group_by and summarise from dplyr:

library(dplyr)
# Group by every column except aut, then paste each group's aut values together
my_df %>% 
  group_by(id, title, autqty, pubb) %>%
  summarise(aut = paste(aut, collapse = ", ")) %>%
  ungroup()

# A tibble: 3 × 5
     id  title autqty   pubb        aut
  <dbl> <fctr>  <dbl> <fctr>      <chr>
1     1 YourMa      2   Good Steve, Joe
2     2   MyMa      1    Meh     Albert
3     3  HisMa      1    Fan      Kevin
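
For readers who would rather not depend on dplyr, the same grouping idea can be sketched in base R with aggregate(). This is an alternative technique, not the method the answer uses, and it rests on two assumptions: the columns are created with stringsAsFactors = FALSE, and the final reordering by id is added here only because aggregate() is not relied on to return rows in id order.

```r
# Same sample data as in the question; stringsAsFactors = FALSE is an
# assumption made here so the text columns are plain character vectors.
my_df <- data.frame(id = c(1, 1, 2, 3),
                    title = c("YourMa", "YourMa", "MyMa", "HisMa"),
                    autqty = c(2, 2, 1, 1),
                    aut = c("Steve", "Joe", "Albert", "Kevin"),
                    pubb = c("Good", "Good", "Meh", "Fan"),
                    stringsAsFactors = FALSE)

# Group by every column except aut and paste each group's aut values together.
# collapse = ", " is passed through to paste().
combined <- aggregate(aut ~ id + title + autqty + pubb,
                      data = my_df,
                      FUN = paste, collapse = ", ")

# Sort explicitly by id rather than relying on aggregate()'s row order.
combined <- combined[order(combined$id), ]
print(combined)
```

Because paste() collapses however many values fall into a group, this should work for any number of aut entries per id, provided the other columns really are identical within each id.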
