How to define a regex that matches 4, 3 or 2 words, in that hierarchical order?
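More precisely, I need a regex that matches 3, 2 or 1 consecutive uppercase words followed by a number of 2 to 5 digits, and that saves each word and the number in its own capture group. For example: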
BULDING ROBERT SMITH 362 ---> Should be matched and the following should
be valid: $1="BULDING"; $2="ROBERT"; $3="SMITH"; $4="362";
BULDING STEVENSON 7255 ---> Should be matched and the following should
be valid: $1="BULDING"; $2="STEVENSON"; $3="7255";
BULDING 15 ---> Should be matched and the following should
be valid: $1="BULDING"; $2="15";
So far I have come up with the following:
([A-Z]+ )?([A-Z]+ )?([A-Z]+) \b(\d{2,5})\b
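But it does not do what I need, because it also captures the " " (space) that follows the first and second optional words. Can you help me get this right?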
Don't capture the space. Wrap each optional part in a non-capturing group:
(?:([A-Z]+) )?(?:([A-Z]+) )?([A-Z]+) \b(\d{2,5})\b
(?:...) creates a non-capturing group: the parentheses group the expression, but no capture group is created for it in the match result.
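One thing worth seeing in practice is which groups end up filled when fewer than three words are present. The sketch below applies the pattern above to the three sample lines. The ^ and $ anchors and the helper name capture_words are assumptions added for illustration, not part of the answer. An optional group that did not take part in the match is left undefined, so the words are collected by filtering out the undefined groups.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Pattern from the answer above. The ^ and $ anchors are an assumption added
# for this sketch so that each line has to match as a whole.
my $re = qr/^(?:([A-Z]+) )?(?:([A-Z]+) )?([A-Z]+) \b(\d{2,5})\b$/;

# Hypothetical helper: returns (\@words, $number), or an empty list on no match.
sub capture_words {
    my ($line) = @_;
    my @groups = $line =~ $re or return;
    my $num    = pop @groups;               # the last group is always the number
    my @words  = grep { defined } @groups;  # skipped optional groups are undef
    return (\@words, $num);
}

for my $line ('BULDING ROBERT SMITH 362', 'BULDING STEVENSON 7255', 'BULDING 15') {
    my ($words, $num) = capture_words($line) or next;
    print join(', ', @$words), " | $num\n";
}
```

Note that with this pattern the last word always lands in the third group, so for a one-word line the first two groups are undefined rather than the last two.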
Do it in two steps:
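#!/usr/bin/env perl
use strict;
use warnings;
use v5.10;

while (<DATA>) {
    # Step 1: capture the run of 1 to 3 uppercase words in $1 and the number in $2.
    if (/\b((?:[A-Z]+\s+){1,3})\b(\d+)\b/) {
        # Step 2: split the captured run into the individual words.
        my @words = split ' ', $1;
        my $num   = $2;
        say "Words = " . join ', ', @words;
        say "Num = $num";
    }
}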
__DATA__
BULDING ROBERT SMITH 362
BULDING STEVENSON 7255
BULDING 15
Output:
Words = BULDING, ROBERT, SMITH
Num = 362
Words = BULDING, STEVENSON
Num = 7255
Words = BULDING
Num = 15
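Alternatively:
([A-Z]+) ([A-Z]+)? ([A-Z]+)? (\d{2,5})
This regex may do, but note that only the words are optional here, not the spaces between them, so a line with fewer than three words matches only if the extra spaces are still present.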
I think you could use the split() function instead:
use strict;
use 5.010;
my $str="BULDING ROBERT SMITH 362";
my @array = split(" ",$str);
my $num = pop(@array);
my ($str1,$str2,$str3) = @array;
say $str1;
say $str2 if $str2;
say $str3 if $str3;
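Plain split() accepts any input, so if the constraints from the question matter (1 to 3 uppercase words, then 2 to 5 digits), each piece can be checked after splitting. A minimal sketch, assuming a hypothetical helper named parse_line that is not part of the answer above:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace, then validate each piece.
# Returns (\@words, $number), or an empty list if the line does not fit.
sub parse_line {
    my ($line) = @_;
    my @parts = split ' ', $line;
    my $num   = pop @parts;                                  # last field is the number
    return unless defined $num && $num =~ /^[0-9]{2,5}$/;    # 2 to 5 digits
    return unless @parts >= 1 && @parts <= 3;                # 1 to 3 words
    return if grep { !/^[A-Z]+$/ } @parts;                   # uppercase letters only
    return (\@parts, $num);
}

for my $line ('BULDING ROBERT SMITH 362', 'BULDING 15', 'Bulding 15') {
    if (my ($words, $num) = parse_line($line)) {
        print "Words = ", join(', ', @$words), "; Num = $num\n";
    }
    else {
        print "No match: $line\n";
    }
}
```

Unlike the regex approaches, this requires the whole line to consist of the words and the number, which may or may not be what is wanted when the data is embedded in longer text.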