Filtering in dplyr between two columns where one is a list or vector

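I'm trying to filter this df by comparing two columns, keeping only the rows where prod appears among the semicolon-separated values in lob: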

Reproducible code:

df <- data.frame(prod = c("CES", "Access", "Access", "CES"), lob = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"), stringsAsFactors = FALSE)

    prod                      lob
1    CES     Access;Entertainment
2 Access                      CES
3 Access                   Access
4    CES Access;Entertainment;CES

Expected result:

    prod                      lob
1 Access                   Access
2    CES Access;Entertainment;CES

I tried splitting the lob column into a vector or a list of elements and then using dplyr's filter with grepl(prod, lob) or prod %in% lob, but neither seems to work:

df %>%
  filter(prod %in% lob)

df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  filter(prod %in% lob)

df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  filter(grepl(prod, lob))

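Probably the simplest approach is to add a rowwise() in there, so that %in% is evaluated one row at a time instead of against the whole column: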

library(dplyr)

df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  rowwise() %>%
  filter(prod %in% lob) %>%
  as.data.frame() # rowwise makes it a tibble, this changes it back if needed
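
To see why the row-wise step matters, here is a small sketch comparing the two evaluations on the question's data. The names columnwise and rowwise_result are only illustrative. Without rowwise(), each prod value is looked up in the entire lob column, and because "CES" and "Access" both occur somewhere in that column as whole values, every row passes.

```r
library(dplyr)

df <- data.frame(
  prod = c("CES", "Access", "Access", "CES"),
  lob  = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"),
  stringsAsFactors = FALSE
)

# Column-wise: each prod is matched against ALL values of lob.
# "CES" equals the lob value in row 2 and "Access" equals the one in row 3,
# so the check is TRUE for every row and nothing is filtered out.
columnwise <- df$prod %in% df$lob

# Row-wise: each prod is matched only against the pieces of its own row's lob.
rowwise_result <- df %>%
  rowwise() %>%
  filter(prod %in% strsplit(lob, ";")[[1]]) %>%
  ungroup()
```

Here columnwise is TRUE for all four rows, while rowwise_result keeps only the two rows shown in the expected result.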

If you really don't want to do the mutate(), you can split inside the filter instead:

df %>%
  rowwise() %>% 
  filter(prod %in% strsplit(lob, ";")[[1]])
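
If rowwise() is not wanted at all, the same per-row check can be sketched with base R's mapply(), which walks prod and the split lob pieces in parallel. This is one possible alternative rather than part of the answer above, and it assumes lob is a character column (not a factor).

```r
library(dplyr)

df <- data.frame(
  prod = c("CES", "Access", "Access", "CES"),
  lob  = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"),
  stringsAsFactors = FALSE
)

# mapply() pairs the i-th prod with the i-th vector of split lob pieces
# and returns one TRUE/FALSE per row, which filter() can use directly.
result <- df %>%
  filter(mapply(function(p, parts) p %in% parts,
                prod, strsplit(lob, ";"),
                USE.NAMES = FALSE))
```

Because no rowwise() is involved, result stays a plain data.frame and needs no conversion back.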

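Another option is stringr::str_detect, which is vectorized over both the string and the pattern, so no rowwise() is needed: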

library(tidyverse)

df %>% 
  filter(str_detect(as.character(lob), as.character(prod)))
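
One caveat: str_detect() treats prod as a regular expression and matches substrings, so a product name that is a prefix of another would also match. The sketch below uses hypothetical data (df2, with an invented "AccessPlus" value that is not in the question) to show the difference, and wraps both sides in the ";" delimiter with fixed() as one possible way to require a whole-element match.

```r
library(dplyr)
library(stringr)

# Hypothetical data: "Access" is a substring of "AccessPlus",
# but it is not one of the ";"-separated elements in row 1.
df2 <- data.frame(
  prod = c("Access", "CES"),
  lob  = c("AccessPlus;CES", "Access;CES"),
  stringsAsFactors = FALSE
)

# Plain substring match: row 1 is kept even though "Access" is not an element.
substring_match <- df2 %>%
  filter(str_detect(lob, prod))

# Delimiter-aware match: surround both sides with ";" and compare literally,
# so only complete elements count.
exact_match <- df2 %>%
  filter(str_detect(paste0(";", lob, ";"), fixed(paste0(";", prod, ";"))))
```

On the question's own data both variants give the same two rows; the difference only appears when one product name is contained in another.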