Negative lookahead checking for absence of multiple patterns
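I am trying to flag which rows should and should not be kept, based on whether the regular expression in the pattern column matches the value in the description column of the data below.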
data <- data.frame(id = c(1,2,2,3,3,4),
old_levels = c(0,1,1,1,1,2),
levels = c(1,2,3,2,3,4),
description = c("vegetable", "fruit", "fruit",
"meat", "meat", "soda"),
pattern = c("vegetable",
"fruit",
"?!(vegetable|fruit)",
"fruit",
"?!(vegetable|fruit)",
NA))
Using dplyr, I thought the following would work:
library(dplyr)
data %>% rowwise() %>% mutate(matches = grepl(pattern, description))
However, this yields:
# A tibble: 6 x 6
# Rowwise:
id old_levels levels description pattern matches
<dbl> <dbl> <dbl> <chr> <chr> <lgl>
1 1 0 1 vegetable vegetable TRUE
2 2 1 2 fruit fruit TRUE
3 2 1 3 fruit ?!(vegetable|fruit) FALSE
4 3 1 2 meat fruit FALSE
5 3 1 3 meat ?!(vegetable|fruit) FALSE
6 4 2 4 soda NA NA
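The NA in row 6 is expected and works as intended, but I am struggling to get the negative lookahead to work: matches in row 5 should be TRUE.
Any help would be greatly appreciated!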
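The lookahead syntax is (?!...), not ?!(...).
Also, the default TRE regex library used by grepl does not support lookarounds, so you need to pass perl=TRUE to switch to the PCRE engine.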
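Fixing the syntax and the engine is not quite enough on its own, because an unanchored negative lookahead can succeed at any position in the string. The sketch below, which uses a made-up vector `x` rather than the question's data frame, compares an unanchored lookahead with the anchored form `^(?!.*(?:vegetable|fruit))` used in the corrected data:

```r
# Minimal sketch: unanchored vs. anchored negative lookahead (PCRE engine).
x <- c("vegetable", "fruit", "meat")

# Unanchored: the regex engine may try every position, and the lookahead
# succeeds wherever neither word starts (e.g. at the "r" in "fruit"),
# so every string matches somewhere.
unanchored <- grepl("(?!vegetable|fruit)", x, perl = TRUE)

# Anchored: ^ pins the check to the start, and .* lets the lookahead
# scan the whole string for either word.
anchored <- grepl("^(?!.*(?:vegetable|fruit))", x, perl = TRUE)

print(unanchored)  # TRUE for all three strings
print(anchored)    # FALSE, FALSE, TRUE
```

Only the anchored form answers the question "does this string contain neither word?".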
You can use
data <- data.frame(id = c(1,2,2,3,3,4),
old_levels = c(0,1,1,1,1,2),
levels = c(1,2,3,2,3,4),
description = c("vegetable", "fruit", "fruit",
"meat", "meat", "soda"),
pattern = c("vegetable",
"fruit",
"^(?!.*(?:vegetable|fruit))",
"fruit",
"^(?!.*(?:vegetable|fruit))",
NA))
data %>% rowwise() %>% mutate(matches = grepl(pattern, description, perl=TRUE))
Output:
> data %>% rowwise() %>% mutate(matches = grepl(pattern, description, perl=TRUE))
# A tibble: 6 x 6
# Rowwise:
id old_levels levels description pattern matches
<dbl> <dbl> <dbl> <chr> <chr> <lgl>
1 1 0 1 vegetable vegetable TRUE
2 2 1 2 fruit fruit TRUE
3 2 1 3 fruit ^(?!.*(?:vegetable|fruit)) FALSE
4 3 1 2 meat fruit FALSE
5 3 1 3 meat ^(?!.*(?:vegetable|fruit)) TRUE
6 4 2 4 soda <NA> NA
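If you would rather not depend on rowwise(), the same row-by-row matching can be sketched in base R with mapply. This is an alternative, not part of the approach above; it rebuilds only the two relevant columns, and the name `matches` simply mirrors the column created earlier:

```r
# Sketch: pair each pattern with its own description, without dplyr.
data <- data.frame(
  description = c("vegetable", "fruit", "fruit", "meat", "meat", "soda"),
  pattern = c("vegetable",
              "fruit",
              "^(?!.*(?:vegetable|fruit))",
              "fruit",
              "^(?!.*(?:vegetable|fruit))",
              NA),
  stringsAsFactors = FALSE  # keep patterns as character on older R versions
)

# mapply calls grepl(pattern[i], description[i], perl = TRUE) for each row;
# USE.NAMES = FALSE stops the patterns being attached as element names.
matches <- mapply(grepl, data$pattern, data$description,
                  MoreArgs = list(perl = TRUE), USE.NAMES = FALSE)

print(matches)  # TRUE, TRUE, FALSE, FALSE, TRUE, NA
```

The NA pattern in the last row still yields NA, as in the dplyr output.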