Filtering in dplyr between two columns where one is a list or vector

Question

I'm trying to filter this df by comparing two columns, keeping only the rows where prod is present in the lob column:

Reproducible code:

df <- data.frame(prod = c("CES", "Access", "Access", "CES"), lob = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"))

    prod                      lob
1    CES     Access;Entertainment
2 Access                      CES
3 Access                   Access
4    CES Access;Entertainment;CES

Expected result:

    prod                      lob
1 Access                   Access
2    CES Access;Entertainment;CES

I've tried splitting the lob column into a vector or a list of elements and then using dplyr's filter with grepl(prod, lob) or prod %in% lob, but neither seems to work:

df %>%
  filter(prod %in% lob)

df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  filter(prod %in% lob)

df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  filter(grepl(prod, lob))

Answer 1

Probably the easiest way is to just add a rowwise() in there:

df %>%
  mutate(lob = strsplit(lob, ";")) %>% 
  rowwise() %>% 
  filter(prod %in% lob) %>% 
  as.data.frame() # rowwise makes it a tibble, this changes it back if needed
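
To illustrate why the rowwise() matters, here is a minimal sketch, assuming dplyr 1.0 or later and character (not factor) columns. Without rowwise(), `%in%` sees the whole columns at once, so each prod is matched against the entire lob list rather than against the elements of its own row, and rows that should be dropped can be kept. The names split_df, no_rowwise and with_rowwise are only for illustration.

```r
library(dplyr)

df <- data.frame(
  prod = c("CES", "Access", "Access", "CES"),
  lob  = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"),
  stringsAsFactors = FALSE  # keep strings as character on R < 4.0 as well
)

# lob becomes a list-column: one character vector per row
split_df <- df %>% mutate(lob = strsplit(lob, ";"))

# Without rowwise(): prod is the full column and lob is the full list,
# so the test is not done row by row
no_rowwise <- split_df %>% filter(prod %in% lob)

# With rowwise(): prod is a single string and lob is that row's vector
with_rowwise <- split_df %>%
  rowwise() %>%
  filter(prod %in% lob) %>%
  ungroup()
```

Comparing nrow(no_rowwise) with nrow(with_rowwise) shows the difference; only the row-wise version returns the two expected rows.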

If you really don't want to do the mutate(), you can do

df %>%
  rowwise() %>% 
  filter(prod %in% strsplit(lob, ";")[[1]])
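
If rowwise() turns out to be slow on a large data frame, one possible alternative is to do the per-row comparison in a small helper and pass the resulting logical vector to filter(). This is only a sketch and not part of the answer above: prod_in_lob is a hypothetical helper name, and it assumes the separator is a literal ";".

```r
library(dplyr)

df <- data.frame(
  prod = c("CES", "Access", "Access", "CES"),
  lob  = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where prod[i] is one of the ";"-separated
# elements of lob[i]
prod_in_lob <- function(prod, lob, sep = ";") {
  prod  <- as.character(prod)  # also handles factor columns
  parts <- strsplit(as.character(lob), sep, fixed = TRUE)
  vapply(
    seq_along(prod),
    function(i) prod[i] %in% parts[[i]],
    logical(1)
  )
}

result <- df %>% filter(prod_in_lob(prod, lob))
```

Because the helper returns one TRUE/FALSE per row, no grouping is needed and the result stays a plain data.frame.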

Answer 2

With stringr::str_detect:

library(tidyverse)

df %>% 
  filter(str_detect(as.character(lob), as.character(prod)))
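
One thing to keep in mind with this approach: str_detect() does a regular-expression substring match, so it works for the sample data but could also match a prod that is only part of a longer lob element. The sketch below uses made-up data (df2 and the value "Accessories" are hypothetical, not from the question) and shows one possible way to require a whole-element match by wrapping both sides in the delimiter.

```r
library(dplyr)
library(stringr)

# Hypothetical data: "Access" is only a substring of "Accessories"
df2 <- data.frame(
  prod = c("Access", "CES"),
  lob  = c("Accessories;CES", "Access;CES"),
  stringsAsFactors = FALSE
)

# Plain substring match: the first row matches even though "Access"
# is not one of the elements of its lob
loose <- str_detect(df2$lob, df2$prod)

# Whole-element match: surround both sides with ";" and compare literally
strict <- str_detect(
  paste0(";", df2$lob, ";"),
  fixed(paste0(";", df2$prod, ";"))
)

strict_result <- df2 %>%
  filter(str_detect(paste0(";", lob, ";"), fixed(paste0(";", prod, ";"))))
```

Using fixed() also avoids surprises if prod ever contains regex metacharacters such as "." or "+".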
