Filtering rows of a character vector using stringr's str_detect()

Question

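I am trying to subset the character column a using dplyr::filter(), stringr::str_detect() and the magrittr pipe, with a regular expression that matches two or more digits.

This only seems to work for a numeric column, and otherwise only when the column is accessed directly with the `$` operator: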

library(tidyverse)

# Create example data: 
test_num <- tibble(
  a = c(1:3, 22:24))
test_num
#> # A tibble: 6 x 1
#>       a
#>   <int>
#> 1     1
#> 2     2
#> 3     3
#> 4    22
#> 5    23
#> 6    24

test_char <- tibble(
  a = as.character(c(1:3, 22:24)))
test_char 
#> # A tibble: 6 x 1
#>   a    
#>   <chr>
#> 1 1    
#> 2 2    
#> 3 3    
#> 4 22   
#> 5 23   
#> 6 24

# Subsetting numerical columns works:
test_num %>% 
  dplyr::filter(a, stringr::str_detect(a, "\\d{2,}"))
#> # A tibble: 3 x 1
#>       a
#>   <int>
#> 1    22
#> 2    23
#> 3    24

# Subsetting a character column does not work:
test_char %>% 
  dplyr::filter(a, stringr::str_detect(a, "\\d{2,}"))
#> Error in filter_impl(.data, quo): Evaluation error: operations are possible only for numeric, logical or complex types.

# Whereas detecting matches by accessing the column
# with the `$` operator works:
test_char$a %>% 
  stringr::str_detect("\\d{2,}")
#> [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE

test_num$a %>% 
  stringr::str_detect("\\d{2,}")
#> [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE

Any ideas about what the problem might be, and how to solve it while still using filter()? Thanks a lot in advance for your help!

Answer 1

Just remove the first a from the filter() call.

Instead of:

test_char %>%
  filter(a, str_detect(a, "2"))

Use:

test_char %>%
  filter(str_detect(a, "2"))

That should work.
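Applied to the regular expression from the question, the same fix might look like the sketch below. It assumes dplyr, stringr and tibble are installed; test_char is rebuilt here so the snippet stands on its own, and two_digits is just an illustrative name. Note that the backslash has to be doubled inside an R string literal.

```r
library(dplyr)
library(stringr)

# Rebuild the example data from the question
test_char <- tibble::tibble(a = as.character(c(1:3, 22:24)))

# The only condition passed to filter() is the logical vector
# returned by str_detect(); "\\d{2,}" matches two or more digits
two_digits <- test_char %>%
  filter(str_detect(a, "\\d{2,}"))

two_digits$a
#> [1] "22" "23" "24"
```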

The first and only argument to filter() (after the piped-in data) should be str_detect(col, "string").
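A likely explanation for the difference between the two columns, offered as an assumption rather than a confirmed description of dplyr internals: every extra argument to filter() is treated as a condition, and the conditions are combined with a logical AND. The error text in the question matches what base R reports when `&` is applied to a character value, while a numeric value is silently coerced to logical. A small base R sketch of that behaviour (res is an illustrative name):

```r
# A numeric value is coerced to logical (nonzero means TRUE),
# so combining it with another condition works
22L & TRUE
#> [1] TRUE

# A character value cannot be used with `&`, so this raises an error;
# tryCatch() captures the message instead of stopping the script
res <- tryCatch("22" & TRUE, error = function(e) conditionMessage(e))
res
```

This would also mean that the numeric version only appeared to work: a stray a in filter() drops rows where a is 0 rather than acting as a column selector.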

Hope this helps!
