Merge rows and values in R based on a condition

I have the following sample data:

# data
school = c('ABC University','ABC Uni','DFG University','DFG U')
applicant = c(2000,3100,210,2000)
students = c(100,2000,300,4000)
df = data.frame(school,applicant,students)

I want to merge it into this:

|school        |applicant|students|
|--------------|---------|--------|
|ABC University|5100     |2100    |
|DFG University|2210     |4300    |

I ran this code:

df$school[df$school == 'ABC Uni'] = 'ABC University'

But it gives me ABC University twice instead of merging the two rows into one.

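It really depends on what the rest of your strings look like, but you could look into grep and use ^ to anchor the pattern to the start of the string.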

df[grep('^ABC U', df$school), 'school'] <- 'ABC University'
df[grep('^DFG U', df$school), 'school'] <- 'DFG University'

Then aggregate as usual:

aggregate(cbind(applicant, students) ~ school, df, sum)
#           school applicant students
# 1 ABC University      5100     2100
# 2 DFG University      2210     4300
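If the variant spellings do not share a convenient prefix, one possible alternative is an explicit lookup vector. The following is only a sketch on the sample data from the question; the name `canonical` and its contents are assumptions made for illustration, and the lookup would have to list every variant that occurs in the real data.

```r
# Sample data from the question
df <- data.frame(
  school    = c("ABC University", "ABC Uni", "DFG University", "DFG U"),
  applicant = c(2000, 3100, 210, 2000),
  students  = c(100, 2000, 300, 4000),
  stringsAsFactors = FALSE  # keep school as character on R < 4.0 as well
)

# Hypothetical lookup: names are the variants, values are the canonical names
canonical <- c("ABC Uni" = "ABC University", "DFG U" = "DFG University")

# Overwrite only the rows whose school appears in the lookup
hit <- df$school %in% names(canonical)
df$school[hit] <- unname(canonical[df$school[hit]])

# Renaming alone leaves four rows; the sum per school is what merges them
result <- aggregate(cbind(applicant, students) ~ school, df, sum)
```

Rows whose name is not in the lookup are left untouched, so unknown variants show up as separate groups in `result` rather than being merged silently.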

Here is a dplyr and stringr solution:

library(dplyr)
library(stringr)
df %>% 
    mutate(school = str_replace_all(school, c(
        "^ABC Uni$" = "ABC University",
        "^DFG U$" = "DFG University"))) %>% 
    group_by(school) %>% 
    summarise(across(c(applicant, students), sum))

Output:

  school         applicant students
  <chr>              <dbl>    <dbl>
1 ABC University      5100     2100
2 DFG University      2210     4300
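A note on the `$` in those patterns: a regex replacement substitutes only the matched part of the string, so the end anchor matters here. The sketch below uses base R `sub` on a made-up two-element vector (`schools` is an assumption for illustration) to show the difference. The grep approach in the other answer does not have this issue because it overwrites the whole cell instead of substituting inside it.

```r
schools <- c("ABC University", "ABC Uni")

# Without `$`, the pattern also matches the first seven characters of the
# full name, and the leftover "versity" is kept after the replacement
unanchored <- sub("^ABC Uni", "ABC University", schools)

# With `$`, only the exact string "ABC Uni" matches
anchored <- sub("^ABC Uni$", "ABC University", schools)

print(unanchored)
print(anchored)
```

Here `unanchored` ends up with a mangled first element ("ABC Universityversity"), while both elements of `anchored` are "ABC University".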