Filtering in dplyr between two columns where one is a list or vector
I'm trying to filter this df by comparing two columns, keeping only the rows where prod is present in the lob column.
Reproducible code:
df <- data.frame(prod = c("CES", "Access", "Access", "CES"), lob = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"))
    prod                      lob
1    CES     Access;Entertainment
2 Access                      CES
3 Access                   Access
4    CES Access;Entertainment;CES
Expected result:
    prod                      lob
1 Access                   Access
2    CES Access;Entertainment;CES
I've tried splitting the lob column into a vector or a list of elements and then using dplyr filter with grepl(prod, lob) or prod %in% lob, but neither seems to work:
df %>%
  filter(prod %in% lob)

df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  filter(prod %in% lob)

df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  filter(grepl(prod, lob))
Probably the easiest way is just to add a rowwise() in there, so that %in% is evaluated one row at a time:
df %>%
  mutate(lob = strsplit(lob, ";")) %>%
  rowwise() %>%
  filter(prod %in% lob) %>%
  as.data.frame() # rowwise makes it a tibble, this changes it back if needed
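To illustrate why the attempts without rowwise() keep every row, here is a minimal sketch. It assumes the df from the question with lob stored as character (hence stringsAsFactors = FALSE), and it uses purrr::map2_lgl as one possible pairwise alternative to rowwise(); the names whole_column and result are mine, not from the question.

```r
library(dplyr)
library(purrr)

df <- data.frame(
  prod = c("CES", "Access", "Access", "CES"),
  lob  = c("Access;Entertainment", "CES", "Access", "Access;Entertainment;CES"),
  stringsAsFactors = FALSE  # assumption: keep both columns as character
)

# Without rowwise(), %in% checks each prod against the WHOLE lob column,
# not against the lob value on the same row. "CES" and "Access" both occur
# as complete lob entries somewhere in the column, so every row passes.
whole_column <- df$prod %in% df$lob

# Pairwise alternative: split each lob string and test the prod
# from the same row against the pieces from the same row.
result <- df %>%
  filter(map2_lgl(prod, strsplit(lob, ";"), ~ .x %in% .y))
```

Here map2_lgl walks the two inputs in parallel, which is what rowwise() achieves in the answer above; which one reads better is mostly a matter of taste.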
If you really don't want to do the mutate(), you can do the split inside filter():
df %>%
  rowwise() %>%
  filter(prod %in% strsplit(lob, ";")[[1]])
Another option is stringr::str_detect, which is vectorised over both arguments and so needs no rowwise():
library(tidyverse)
df %>%
  filter(str_detect(as.character(lob), as.character(prod)))
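One thing to watch with str_detect: it treats prod as a regular expression and matches substrings, so a prod of "Access" would also match a lob entry such as "AccessPlus". The sketch below uses hypothetical data (df2 is made up for illustration, not from the question) and shows one possible way to compare whole entries only, by wrapping both sides in the ";" delimiter and matching literally with fixed().

```r
library(dplyr)
library(stringr)

# Hypothetical data: "AccessPlus" contains "Access" only as a substring.
df2 <- data.frame(
  prod = c("Access", "CES"),
  lob  = c("AccessPlus;Entertainment", "Access;CES"),
  stringsAsFactors = FALSE
)

# Plain str_detect() matches substrings, so the first row is kept too.
loose <- df2 %>%
  filter(str_detect(lob, prod))

# Surrounding both sides with the delimiter and using fixed() means
# only complete ";"-separated entries can match.
strict <- df2 %>%
  filter(str_detect(paste0(";", lob, ";"), fixed(paste0(";", prod, ";"))))
```

With the data in the question the two versions give the same result; the stricter form only matters if one product name can be a prefix or fragment of another.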