Regex won't compile because of 'unclosed character class' in named capture group
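I'm getting "error: unclosed character class" in a Rust regex. The regex works fine in an online tester that uses PCRE-compatible regexes, but it fails with the regex crate on the Rust Playground.
The character class must include a minus sign. I've tried putting the minus sign in the first position, in the last position, and leaving it out entirely, but I always get the error.
For most expected inputs the source string will be "op(number)", for some op and some non-negative integer. For a few, I expect "op(number/number/number)".
If there are better ways to extract the named captures, I'm all ears.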
use lazy_static::lazy_static;
use regex::Regex;

fn main() {
    lazy_static! {
        static ref FANCY_OPCODE_RE: Regex = Regex::new(r"(?x)
            ^ # Match start of string
            (?P<opname>[-a-zA-Z#+]+) # Match abbreviated name of OpCode as 'opname'
            \( # Open parentheses
            (?P<arg1>[0-9]+) # Match first number as 'arg1'
            (/ # Delimiter
            (?P<arg2>[0-9]+) # Optionally match second number as 'arg2'
            / # Delimiter
            (?P<arg3>[0-9]+))? # Optionally match third number as 'arg3'
            \) # Closing parenthesis
            $ # Match end of string
        ").unwrap();
    }
    let s = "+loop(3)";
    let opname: String;
    let arg1: String;
    let arg2: String;
    let arg3: String;
    match FANCY_OPCODE_RE.captures(s) {
        Some(cap) => {
            opname = format!("{:?}", cap.name("opname"));
            arg1 = format!("{:?}", cap.name("arg1"));
            arg2 = format!("{:?}", cap.name("arg2"));
            arg3 = format!("{:?}", cap.name("arg3"));
        },
        None => {
            opname = "No match".to_string();
            arg1 = String::new();
            arg2 = String::new();
            arg3 = String::new();
        }
    }
    println!("opname = {}, arg1 = {}, arg2 = {}, arg3 = {}", opname, arg1, arg2, arg3);
}
The error message is:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Syntax(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
regex parse error:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1: (?x)
2: ^ # Match start of string
3: (?P<opname>[-a-zA-Z#+]+) # Match abbreviated name of OpCode as 'opname'
^^
4: \( # Open parentheses
5: (?P<arg1>[0-9]+) # Match first number as 'arg1'
6: (/ # Delimiter
7: (?P<arg2>[0-9]+) # Optionally match second number as 'arg2'
8: / # Delimiter
9: (?P<arg3>[0-9]+))? # Optionally match third number as 'arg3'
10: \) # Closing parenthesis
11: $ # Match end of string
12:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
error: unclosed character class
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)', src/main.rs:17:12
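When debugging a problem, it's useful to create a minimal, reproducible example. By removing the parts of the regex that don't contribute to the problem, you can quickly reduce it to: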
Regex::new(r"(?x)(?P<opname>[-a-zA-Z#+]+)").unwrap();
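The problem is that your regex includes the comment character #. With verbose mode (?x) enabled, # starts a comment that runs to the end of the line, even inside a character class, so the closing ] is never seen. Escape it: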
[-a-zA-Z\#+]