Extracting part of a data frame with a wildcard

Question

Suppose you want to build a data frame that is a subset of another data frame, as with a SQL query like:

SELECT * FROM df WHERE columns_name IN ('a', 'b', 'c');

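I assume dplyr includes this functionality, but I don't see a wildcard option in it. What I need is to filter rows that have very long string values, which could easily be specified as containing %something_a%, %something_b%, or %something_c%. I bet there is a simple way to do this. Does anyone know what it is?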

Answer 1

I would use grepl, as in the following example:

library(dplyr)  # provides %>% and filter()

df <- data.frame(fruits = c("apple", "banana", "cherry"))
# Keep only the rows whose fruits value contains "app"
df %>%
  filter(grepl("app", fruits))

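grepl uses regular expressions, which you can use to check for patterns in strings. It returns TRUE for each element that matches the pattern, so filter keeps exactly those rows.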
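To cover the several-substrings case from the question, one option is to join the substrings with "|" so that a single regular expression matches any of them. The sketch below is only an illustration: the data, the column name `description`, and the variable names `patterns` and `regex` are made up, not taken from the question.

```r
library(dplyr)

# Hypothetical data: the column name and values are assumptions for illustration.
df <- data.frame(
  id = 1:4,
  description = c(
    "prefix something_a suffix",
    "contains something_b here",
    "nothing relevant",
    "ends with something_c"
  )
)

# Join the substrings with "|" so the regular expression matches any one of them.
patterns <- c("something_a", "something_b", "something_c")
regex <- paste(patterns, collapse = "|")

# Keep rows whose description contains at least one of the substrings.
result <- df %>%
  filter(grepl(regex, description))
```

Because the substrings are interpreted as regular expressions, any that contain metacharacters such as "." or "(" would need to be escaped first.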
