stringr: find rows where any column content matches a regex

Question

Consider the following example:

> data_text <- tibble::tibble(text = c('where', 'are', 'you'),
                              blob = c('little', 'nice', 'text'))
> data_text
# A tibble: 3 x 2
   text   blob
  <chr>  <chr>
1 where little
2   are   nice
3   you   text

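I want to print the rows that contain the regular expression 'text' (i.e., row 3).

The problem is that I have hundreds of columns and I don't know which one contains this string. str_detect only handles one column at a time...

How can I do this with the stringr package? Thanks!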

Answer 1

You can do this with stringr and dplyr.

You should use filter_all from dplyr >= 0.5.0.

I extended the data to make the result easier to see:

library(dplyr)
library(stringr)

data_text <- data.frame(text = c('text', 'where', 'are', 'you'),
                        one_more_text = c('test', 'test', 'test', 'test'),
                        blob = c('wow', 'little', 'nice', 'text'))

data_text %>%
  filter_all(any_vars(str_detect(., 'text')))

# output
  text one_more_text blob
1 text          test  wow
2  you          test text
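
For readers on a recent dplyr: the scoped verbs such as filter_all are superseded, and if_any() (available from dplyr 1.0.4) expresses the same idea. The snippet below is only a sketch under that assumption, reusing the extended data from this answer; stringsAsFactors = FALSE is added so the columns are character on older R versions too.

```r
library(dplyr)
library(stringr)

# Same extended data as above; stringsAsFactors = FALSE keeps the
# columns as character regardless of the R version (an addition here).
data_text <- data.frame(text = c('text', 'where', 'are', 'you'),
                        one_more_text = c('test', 'test', 'test', 'test'),
                        blob = c('wow', 'little', 'nice', 'text'),
                        stringsAsFactors = FALSE)

# Keep a row if the regex 'text' matches in at least one column.
result <- data_text %>%
  filter(if_any(everything(), ~ str_detect(.x, 'text')))

print(result)
```

Note that the match is against cell values, so the column name one_more_text does not count as a hit.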

Answer 2

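You can treat the data.frame as a list and use purrr::map to check each column; the result can then be reduced to a single logical vector that filter can use. Alternatively, purrr::pmap can iterate over all columns in parallel: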

library(tidyverse)

data_text <- tibble(text = c('where', 'are', 'you'),
                    blob = c('little', 'nice', 'text'))

data_text %>% filter(map(., ~.x == 'text') %>% reduce(`|`))
#> # A tibble: 1 x 2
#>    text  blob
#>   <chr> <chr>
#> 1   you  text

data_text %>% filter(pmap_lgl(., ~any(c(...) == 'text')))
#> # A tibble: 1 x 2
#>    text  blob
#>   <chr> <chr>
#> 1   you  text
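
Both calls above test for exact equality with 'text' rather than a regex match. If a regex match is what you need, one possible variation is to swap the comparison for str_detect. This is a sketch, not part of the original answer, and the value 'context' is a made-up cell chosen to show the difference: it matches the regex but would not equal 'text'.

```r
library(dplyr)
library(purrr)
library(stringr)

# 'context' is a hypothetical value: it contains 'text' as a substring
# but is not equal to 'text'.
data_text <- tibble(text = c('where', 'are', 'you'),
                    blob = c('little', 'nice', 'context'))

# One logical vector per column, then OR them together element-wise.
keep <- data_text %>%
  map(~ str_detect(.x, 'text')) %>%
  reduce(`|`)

result <- data_text[keep, ]
print(result)
```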

Answer 3

# Flag rows where at least one cell matches the regex "text"
matches <- apply(data_text, 1, function(x) any(grepl("text", x)))
result <- data_text[matches, ]

No other packages are needed. Hope this helps!
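
A possible column-wise variant in base R, in case the row-wise apply is a concern: apply converts the data frame to a matrix first, which can reformat values when column types are mixed. The sketch below checks each column separately and combines the results with Reduce. It assumes the small example data from the question.

```r
# Example data from the question, kept as character columns.
data_text <- data.frame(text = c('where', 'are', 'you'),
                        blob = c('little', 'nice', 'text'),
                        stringsAsFactors = FALSE)

# grepl on each column gives one logical vector per column;
# Reduce with `|` ORs them together element-wise, one value per row.
matches <- Reduce(`|`, lapply(data_text, function(col) grepl("text", col)))

# drop = FALSE keeps the result a data frame even with a single column.
result <- data_text[matches, , drop = FALSE]
print(result)
```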
