Negative lookahead checking for absence of multiple patterns

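I am currently trying to flag which rows should and should not be kept, based on whether the regular expression in the pattern column matches the value in the description column of the data below.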

data <- data.frame(id = c(1,2,2,3,3,4), 
                   old_levels = c(0,1,1,1,1,2),
                   levels = c(1,2,3,2,3,4),
                   description = c("vegetable", "fruit", "fruit",
                                   "meat", "meat", "soda"),
                   pattern = c("vegetable",
                               "fruit", 
                               "?!(vegetable|fruit)", 
                               "fruit",
                               "?!(vegetable|fruit)", 
                               NA))

Using dplyr, I thought the following should work:

data %>% rowwise() %>% mutate(matches = grepl(pattern, description))

However, this yields:

# A tibble: 6 x 6
# Rowwise: 
     id old_levels levels description pattern             matches
  <dbl>      <dbl>  <dbl> <chr>       <chr>               <lgl>  
1     1          0      1 vegetable   vegetable           TRUE   
2     2          1      2 fruit       fruit               TRUE   
3     2          1      3 fruit       ?!(vegetable|fruit) FALSE  
4     3          1      2 meat        fruit               FALSE  
5     3          1      3 meat        ?!(vegetable|fruit) FALSE  
6     4          2      4 soda        NA                  NA        

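The NA is expected and works as intended, but I am struggling to get the negative lookahead to work, since matches in row 5 should be TRUE...

Any help would be greatly appreciated!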

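The negative lookahead syntax is (?!...), not ?!(...).

Also, the default TRE regex library used by grepl does not support lookarounds, so you need to pass perl=TRUE to switch to the PCRE engine.

You can use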
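
To illustrate both points on a standalone vector, here is a minimal sketch. The vector x and the single-word pattern are made up for the illustration and are not part of your data:

```r
# Hypothetical sample values, not taken from the question's data frame
x <- c("vegetable", "fruit", "meat")

# Well-formed negative lookahead: the "?!" sits inside the parentheses.
# perl = TRUE makes grepl use PCRE, which understands lookarounds.
res <- grepl("^(?!fruit)", x, perl = TRUE)
print(res)
# TRUE FALSE TRUE
```

Only "fruit" fails, because it is the only string that starts with the text the lookahead forbids.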

library(dplyr)  # provides %>%, rowwise() and mutate()

data <- data.frame(id = c(1,2,2,3,3,4), 
                   old_levels = c(0,1,1,1,1,2),
                   levels = c(1,2,3,2,3,4),
                   description = c("vegetable", "fruit", "fruit",
                                   "meat", "meat", "soda"),
                   pattern = c("vegetable",
                               "fruit", 
                               "^(?!.*(?:vegetable|fruit))", 
                               "fruit",
                               "^(?!.*(?:vegetable|fruit))", 
                               NA))

data %>% rowwise() %>% mutate(matches = grepl(pattern, description, perl=TRUE))

Output:

> data %>% rowwise() %>% mutate(matches = grepl(pattern, description, perl=TRUE))
# A tibble: 6 x 6
# Rowwise: 
     id old_levels levels description pattern                    matches
  <dbl>      <dbl>  <dbl> <chr>       <chr>                      <lgl>  
1     1          0      1 vegetable   vegetable                  TRUE   
2     2          1      2 fruit       fruit                      TRUE   
3     2          1      3 fruit       ^(?!.*(?:vegetable|fruit)) FALSE  
4     3          1      2 meat        fruit                      FALSE  
5     3          1      3 meat        ^(?!.*(?:vegetable|fruit)) TRUE   
6     4          2      4 soda        <NA>                       NA
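
As for why the pattern is written as ^(?!.*(?:vegetable|fruit)) rather than just (?!vegetable|fruit): an unanchored lookahead can succeed at any later position in the string, so it matches almost everything. Anchoring it at ^ and putting .* inside the lookahead checks the whole string once. A small sketch on a made-up vector (x is only for illustration):

```r
# Hypothetical sample values for illustration
x <- c("vegetable", "fruit", "meat", "soda")

# Unanchored: fails at position 1 of "fruit", but succeeds one character
# later (before "ruit"), so every string is reported as a match.
unanchored <- grepl("(?!vegetable|fruit)", x, perl = TRUE)
print(unanchored)
# TRUE TRUE TRUE TRUE

# Anchored: evaluated only at the start, and .* lets the lookahead scan
# the rest of the string for either word.
anchored <- grepl("^(?!.*(?:vegetable|fruit))", x, perl = TRUE)
print(anchored)
# FALSE FALSE TRUE TRUE
```

The (?:...) is just a non-capturing group around the alternation; a plain capturing group would give the same TRUE/FALSE result here.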