Regex to match up to the first occurrence of the closing bracket

I have a character vector named cars, shown below:

cars
[1] "Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair"   
[2] "Other car(21, model-155) looked in good condition but car ( 36, model-8878) looked to be in terrible condition."

I need to extract the following parts from the strings:

car(52;model-14557)
car(21, model-155)
car ( 36, model-8878)

I tried to extract them with the regex below:

stringr::str_extract_all(cars, "(.car\\s{0,5}\\(([^]]+)\\))")

This gave me the following output:

[[1]]
[1] " car(52;model-14557) had a good engine(workable condition)"

[[2]]
[1] " car(21, model-155) looked in good condition but car ( 36, model-8878)"

Is there a way to extract the word car together with its associated number and model?

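Your regex does not work because you are using [^]]+, which matches one or more characters other than ]. That class also matches ( and ), so the match runs from the first ( to the last ) as long as there is no ] in between.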
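The difference between the two character classes can be seen on the first sample string. This is a minimal sketch in base R, not part of the original answer: it drops the leading `.` and the capture groups from the question's pattern to isolate the character class, and `perl = TRUE` is an assumption made only to pin down the regex flavour.

```r
# First sample string from the question.
x <- "Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair"

# [^]]+ excludes only "]", so it is free to cross ")" and "(".
# Being greedy, it runs on and the match ends at the LAST ")" in the string.
greedy <- regmatches(x, regexpr("car\\s*\\([^]]+\\)", x, perl = TRUE))

# [^()]+ excludes both parentheses, so the match must end at the FIRST ")".
bounded <- regmatches(x, regexpr("car\\s*\\([^()]+\\)", x, perl = TRUE))

print(greedy)
print(bounded)
```

The first result reproduces the over-long match from the question (without the leading space that the `.` contributed there), while the second stops at the closing parenthesis of the car entry.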

Use

> cars <- c("Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair","Other car(21, model-155) looked in good condition but car ( 36, model-8878) looked to be in terrible condition.")
> library(stringr)
> str_extract_all(cars, "\\bcar\\s*\\([^()]+\\)")
[[1]]
[1] "car(52;model-14557)"

[[2]]
[1] "car(21, model-155)"    "car ( 36, model-8878)"

The regex is \bcar\s*\([^()]+\) (see the online regex demo).

It matches:

  • \b - a word boundary
  • car - the literal character sequence
  • \s* - 0+ whitespace characters
  • \( - a literal (
  • [^()]+ - 1 or more characters other than ( and )
  • \) - a literal ).

Note that the same regex yields the same result with the following base R code:

> regmatches(cars, gregexpr("\\bcar\\s*\\([^()]+\\)", cars))
[[1]]
[1] "car(52;model-14557)"

[[2]]
[1] "car(21, model-155)"    "car ( 36, model-8878)"
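One detail that is easy to miss when copying the pattern into R source: inside an ordinary R string literal, each regex backslash has to be written as two backslashes. The sketch below is an illustration rather than part of the original answer; the variable names `pattern` and `matches` are made up for the example, and `perl = TRUE` is an assumption made to fix the regex flavour.

```r
# The regex \bcar\s*\([^()]+\) written as an R string literal:
# every backslash is doubled so that R passes a single one to the regex engine.
pattern <- "\\bcar\\s*\\([^()]+\\)"

# writeLines() shows the string as the regex engine receives it.
writeLines(pattern)

cars <- c(
  "Only one car(52;model-14557) had a good engine(workable condition), others engine were damaged beyond repair",
  "Other car(21, model-155) looked in good condition but car ( 36, model-8878) looked to be in terrible condition."
)

# One character vector of matches per input string.
matches <- regmatches(cars, gregexpr(pattern, cars, perl = TRUE))
print(matches)
```

Writing the pattern with single backslashes, as in "\s", is rejected by the R parser as an unrecognized escape before any matching takes place.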