Use grepl on each element of a data frame column to find a string in a different data frame

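I need to check whether each element of a column in one data frame appears in a column of another data frame, so that I can retrieve a count and a total.

Example

Data frame 1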

Details<-data.frame(FirstName=c("Carlos SM","Carlos JOH","Carlos WIL","Carlos JON","Carlos BR","Peter D","Peter MILL","Peter WILS","Peter MOO","Homer T"),Points=c("3","4","7","6","4","9","1","2","1","9"))

Data frame 2

Results <- data.frame(Person=c("Carlos","Homer","Peter"))

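The ideal output would add two columns to the Results data frame: one counting how many times each string is found in the Details data frame, and one holding the total points. Like this:

FirstName  Appearances  Total Points
Carlos          5             24
Peter           4             13
Homer           2             13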

This should do the trick:

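# Count the Details rows whose FirstName contains each person's name
Results$Appearances <- sapply(Results$Person, function(x) sum(grepl(x, Details$FirstName)))
# Convert Points via as.character() first so the sum is correct even if Points is a factor
Results$`Total Points` <- sapply(Results$Person, function(x) sum(grepl(x, Details$FirstName) * as.numeric(as.character(Details$Points))))
Results
  Person Appearances Total Points
1 Carlos           5           24
2  Homer           1            9
3  Peter           4           13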
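
One caveat worth checking here, sketched under the assumption that Points may be stored as a factor (the default for strings in data.frame() before R 4.0.0): as.numeric() applied directly to a factor returns the internal level codes rather than the printed values, which silently changes the totals. The names pts_factor, codes, and values below are illustrative only.

```r
# Assumption: Points ended up as a factor (pre-R 4.0 default for string columns).
pts_factor <- factor(c("3", "4", "7", "6", "4", "9", "1", "2", "1", "9"))

# as.numeric() on a factor gives the level codes (positions among the sorted levels).
codes <- as.numeric(pts_factor)

# Converting to character first recovers the actual numbers.
values <- as.numeric(as.character(pts_factor))

sum(codes)   # total of the level codes, not of the points
sum(values)  # total of the real point values
```

If the two sums differ on your data, the column is a factor and needs the as.character() step.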

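Note that the numbers in your expected output seem a bit off, which is confusing. Is that just a mistake on your part, or do you want some non-obvious way of matching characters that produces that result?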
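
On the matching question: grepl(x, ...) treats x as a regular expression and matches it anywhere in the string, so a name such as "Carl" would also count every "Carlos" row. Below is a minimal sketch of a stricter variant, assuming each FirstName entry starts with the person's first name followed by a space. The helper name count_first_name and the sample vector full_names are hypothetical.

```r
# Hypothetical helper: count entries whose first word is exactly `person`.
count_first_name <- function(person, full_names) {
  # Compare against the text before the first space instead of using `person`
  # as a regex, so partial matches and regex metacharacters cannot interfere.
  first_words <- sub(" .*$", "", full_names)
  sum(first_words == person)
}

full_names <- c("Carlos SM", "Carl X", "Peter D", "Homer T", "Carlos JON")
count_first_name("Carl", full_names)  # counts only the "Carl X" row
sum(grepl("Carl", full_names))        # also counts the "Carlos" rows
```

Whether the looser or the stricter behaviour is wanted depends on how the names in Details are actually written.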

Using tidyr and dplyr:

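library(tidyr)
library(dplyr)
# Split "Carlos SM" into a first name (Person) and the rest, then summarise per person
Details %>% separate(FirstName, c("Person", "last"), " ") %>%
            group_by(Person) %>%
            summarise(Appearances = n(), 
                      # Points is stored as text, so convert it before summing
                      "Total Points" = sum(as.numeric(as.character(Points)))) %>%
            left_join(Results, ., by = "Person")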
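
For readers without tidyr and dplyr, the same split-then-group idea can be sketched in base R. This assumes, as above, that the first word of FirstName is the person's name. The intermediate names d, counts, totals, and out, and the column name TotalPoints, are illustrative choices rather than anything from the question.

```r
Details <- data.frame(
  FirstName = c("Carlos SM", "Carlos JOH", "Carlos WIL", "Carlos JON", "Carlos BR",
                "Peter D", "Peter MILL", "Peter WILS", "Peter MOO", "Homer T"),
  Points = c("3", "4", "7", "6", "4", "9", "1", "2", "1", "9"),
  stringsAsFactors = FALSE
)
Results <- data.frame(Person = c("Carlos", "Homer", "Peter"), stringsAsFactors = FALSE)

# Keep only the first word of each name and convert the points to numbers.
d <- data.frame(Person = sub(" .*$", "", Details$FirstName),
                Points = as.numeric(Details$Points),
                stringsAsFactors = FALSE)

# One aggregate() call per summary: row count and point total per person.
counts <- aggregate(Points ~ Person, data = d, FUN = length)
names(counts)[2] <- "Appearances"
totals <- aggregate(Points ~ Person, data = d, FUN = sum)
names(totals)[2] <- "TotalPoints"

# all.x = TRUE keeps every row of Results, like left_join().
out <- merge(merge(Results, counts, by = "Person", all.x = TRUE),
             totals, by = "Person", all.x = TRUE)
out
```

Note that merge() sorts the result by the key column, whereas left_join() keeps the row order of Results.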