Find rows where one column string is in another column using dplyr in R

Question

I'm looking to pull back the rows where the value in one column exists as a substring of another column (within the same row).

I have a df:

A <- c("cat", "dog", "boy")
B <- c("cat in the cradle", "meet the parents", "boy meets world")

df <- data.frame(A, B, stringsAsFactors = FALSE)

A       B
cat     cat in the cradle
dog     meet the parents
boy     boy meets world

I'm trying something like this:

df2 <- df %>%
          filter(grepl(A, B)) # doesn't work because it thinks A is the whole column vector

df2 <- df %>%
          filter(B %in% A) # which doesn't work because it has to be exact

I want it to produce

A       B
cat     cat in the cradle
boy     boy meets world

Thanks in advance!

Matt

Answer 1

You can either use Map to apply the function to both vectors in parallel, or use sapply to iterate through the rows

df %>%
  filter(unlist(Map(function(x, y) grepl(x, y), A, B)))
    A                 B
1 cat cat in the cradle
2 boy   boy meets world

df %>%
  filter(sapply(1:nrow(.), function(i) grepl(A[i], B[i])))
    A                 B
1 cat cat in the cradle
2 boy   boy meets world
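
If you go the row-wise route, vapply may be a slightly safer choice than sapply: it always returns a logical vector, whereas sapply gives back an empty list when there are zero rows. A minimal base R sketch, assuming the data from the question (the name keep is only for illustration):

```r
A <- c("cat", "dog", "boy")
B <- c("cat in the cradle", "meet the parents", "boy meets world")
df <- data.frame(A, B, stringsAsFactors = FALSE)

# Test A[i] against B[i] one row at a time; logical(1) pins the result type
keep <- vapply(seq_len(nrow(df)),
               function(i) grepl(df$A[i], df$B[i]),
               logical(1))

# Keep only the rows where the pattern was found
df[keep, ]
```

The same keep vector could also be passed to filter() inside a dplyr pipeline.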

Answer 2

We can use mapply

df[mapply(grepl, df$A, df$B),]
#    A                 B
#1 cat cat in the cradle
#3 boy   boy meets world
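
One thing to watch: grepl treats each value of A as a regular expression. If A might contain characters such as . or +, passing fixed = TRUE through MoreArgs should make the match literal. A small sketch with made-up values (pat and txt are hypothetical, not from the question):

```r
# Hypothetical data where the patterns contain regex metacharacters
pat <- c("a.b", "1+1")
txt <- c("axb", "1+1=2")

# Default: each pattern is interpreted as a regular expression
as_regex <- unname(mapply(grepl, pat, txt))

# fixed = TRUE: each pattern is matched as a literal substring
as_literal <- unname(mapply(grepl, pat, txt, MoreArgs = list(fixed = TRUE)))

as_regex    # "a.b" matches "axb"; "1+1" does not match "1+1=2"
as_literal  # the reverse: only the literal "1+1" is found
```

Which behaviour you want depends on whether A holds patterns or plain text.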

Update

With the tidyverse, a similar option is purrr::map2_lgl with stringr::str_detect

library(tidyverse)
df %>% 
   filter(map2_lgl(B, A,  str_detect))
#     A                 B
#1 cat cat in the cradle
#2 boy   boy meets world

Data

df <- data.frame(A, B, stringsAsFactors=FALSE)

Answer 3

For completeness, this can be done easily with str_detect from stringr (part of the tidyverse)

library(tidyverse)

df <- tibble(A, B) %>%
      filter(str_detect(B, fixed(A)))  # fixed() matches A literally, row by row

df
# A tibble: 2 x 2
#   A     B                
#  <chr> <chr>            
#1 cat   cat in the cradle
#2 boy   boy meets world
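
The reason no map or loop is needed here is that str_detect is vectorised over both the string and the pattern, so B[i] is paired with A[i] when the two have the same length. A short sketch to check that, assuming stringr is installed (vectorised and looped are illustrative names):

```r
library(stringr)

A <- c("cat", "dog", "boy")
B <- c("cat in the cradle", "meet the parents", "boy meets world")

# str_detect() pairs string[i] with pattern[i] when both have the same length
vectorised <- str_detect(B, fixed(A))

# The same comparison written as an explicit element-wise loop
looped <- unname(mapply(function(s, p) str_detect(s, fixed(p)), B, A))

identical(vectorised, looped)
```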
