Regex won't compile because of 'unclosed character class' in named capture group

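I get "error: unclosed character class" from a Rust regular expression. The pattern works fine in an online regex tester that uses PCRE-compatible syntax, but it fails on the Rust Playground with the regex crate.

The character class must include a minus sign. I tried putting the minus sign first, putting it last, and leaving it out entirely, but I get the error every time.

For most expected inputs, the source string will be "op(number)" for some op and some non-negative integer. For a few, I expect "op(number/number/number)".

If there is a better way to extract the named captures, I'm all ears.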

use lazy_static::lazy_static;
use regex::Regex;

fn main() {
    lazy_static! {
        static ref FANCY_OPCODE_RE: Regex = Regex::new(r"(?x)
            ^                              # Match start of string
            (?P<opname>[-a-zA-Z#+]+)       # Match abbreviated name of OpCode as 'opname'
            \(                             # Open parentheses
            (?P<arg1>[0-9]+)               # Match first number as 'arg1'
            (/                             # Delimiter
            (?P<arg2>[0-9]+)               # Optionally match second number as 'arg2'
            /                              # Delimiter
            (?P<arg3>[0-9]+))?             # Optionally match third number as 'arg3'
            \)                             # Closing parenthesis
            $                              # Match end of string
        ").unwrap();
    }
    let s = "+loop(3)";
    let opname: String; 
    let arg1: String;
    let arg2: String;
    let arg3: String;
    match FANCY_OPCODE_RE.captures(s) {
        Some(cap) => { 
            opname = format!("{:?}", cap.name("opname")); 
            arg1 = format!("{:?}", cap.name("arg1"));
            arg2 = format!("{:?}", cap.name("arg2"));
            arg3 = format!("{:?}", cap.name("arg3"));
        },
        None => { 
            opname = "No match".to_string(); 
            arg1 = String::new();
            arg2 = String::new();
            arg3 = String::new();
        }
    }

    println!("opname = {}, arg1 = {}, arg2 = {}, arg3 = {}", opname, arg1, arg2, arg3);
}

The error message is:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
regex parse error:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1: (?x)
 2:             ^                              # Match start of string
 3:             (?P<opname>[-a-zA-Z#+]+)       # Match abbreviated name of OpCode as 'opname'
                           ^^
 4:             \(                             # Open parentheses
 5:             (?P<arg1>[0-9]+)               # Match first number as 'arg1'
 6:             (/                             # Delimiter
 7:             (?P<arg2>[0-9]+)               # Optionally match second number as 'arg2'
 8:             /                              # Delimiter
 9:             (?P<arg3>[0-9]+))?             # Optionally match third number as 'arg3'
10:             \)                             # Closing parenthesis
11:             $                              # Match end of string
12:         
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
error: unclosed character class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)', src/main.rs:17:12

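When debugging a problem, it helps to create a minimal, reproducible example. By removing the parts of the regex that do not cause the problem, you can quickly reduce it to: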

Regex::new(r"(?x)(?P<opname>[-a-zA-Z#+]+)").unwrap();

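The problem is that you've included the comment character # in your regular expression. With the (?x) flag, # starts a comment that runs to the end of the line, even inside a character class, so the closing ] is never seen. Escape it: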

[-a-zA-Z\#+]