How to define a regex that matches 4, 3 or 2 words, in that order of preference?

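More precisely, I need a regex that matches 3, 2 or 1 consecutive uppercase words followed by a 2- to 5-digit number, and that stores each word and the number in its own capture group. For example: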

BULDING ROBERT SMITH 362 ---> Should be matched and the following should
                              be valid: $1="BULDING"; $2="ROBERT"; $3="SMITH"; $4="362";

BULDING STEVENSON 7255 ---> Should be matched and the following should
                              be valid: $1="BULDING"; $2="STEVENSON"; $3="7255";

BULDING 15 ---> Should be matched and the following should
                              be valid: $1="BULDING"; $2="15";

So far I have come up with the following:

([A-Z]+ )?([A-Z]+ )?([A-Z]+) \b(\d{2,5})\b

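But it doesn't meet my needs, because the first and second optional groups also capture the " " (space) that follows the word. Can you help me get this right?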

Don't capture the space. Use non-capturing groups for the optional parts:

(?:([A-Z]+) )?(?:([A-Z]+) )?([A-Z]+) \b(\d{2,5})\b

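(?:...) creates a non-capturing group: it groups the enclosed expression like ordinary parentheses, but does not create a capture in the match result.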
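
One thing to keep in mind with this pattern is that the optional groups still keep their numbers, so when fewer than three words are present the unused ones come back undefined rather than shifting the others down. A minimal sketch of how that could be handled, assuming a hypothetical helper named parse_line (not part of the answer above) that simply drops the undefined groups:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured words followed by the number,
# or an empty list when the line does not match.
sub parse_line {
    my ($line) = @_;

    # In list context the match returns all four capture groups;
    # optional groups that did not take part in the match are undef.
    my @groups = $line =~ /(?:([A-Z]+) )?(?:([A-Z]+) )?([A-Z]+) \b(\d{2,5})\b/
        or return;

    # Keep only the groups that actually captured something.
    return grep { defined } @groups;
}

for my $line ('BULDING ROBERT SMITH 362', 'BULDING STEVENSON 7255', 'BULDING 15') {
    my @fields = parse_line($line);
    print join(', ', @fields), "\n";
}
```

For "BULDING STEVENSON 7255" the raw groups are $1="BULDING", $2 undefined, $3="STEVENSON", $4="7255", which is why the sketch filters with grep instead of relying on fixed group numbers.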

Do it in two steps:

#!/usr/bin/env perl

use strict;
use warnings;
use v5.10;

while (<DATA>) {
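    # Step 1: capture the run of 1 to 3 uppercase words in $1 and the number in $2.
    if (/\b((?:[A-Z]+\s+){1,3})\b(\d+)\b/) {
        # Step 2: split the captured run into individual words.
        my @words = split ' ', $1;
        my $num = $2;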

        say "Words = " . join ', ', @words;
        say "Num   = $num";
    }
}

__DATA__
BULDING ROBERT SMITH 362
BULDING STEVENSON 7255
BULDING 15

Output:

Words = BULDING, ROBERT, SMITH
Num   = 362
Words = BULDING, STEVENSON
Num   = 7255
Words = BULDING
Num   = 15

Another option:

([A-Z]+)(?: ([A-Z]+))?(?: ([A-Z]+))? (\d{2,5})

This regex should do it. I think you could also use the split() function instead:

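use strict;
use 5.010;
my $str="BULDING ROBERT SMITH 362";
my @array = split(" ",$str);      # split on whitespace
my $num = pop(@array);            # the last field is the number
my ($str1,$str2,$str3) = @array;  # the remaining fields are the words
say $str1;
say $str2 if defined $str2;
say $str3 if defined $str3;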
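
Note that split() on its own does not check the constraints from the question (uppercase words only, 1 to 3 of them, a 2- to 5-digit number). A rough sketch of how those checks might be added, assuming a hypothetical helper named split_fields and assuming the whole string is the record (unlike the regexes above, which search inside a longer line):

```perl
use strict;
use warnings;

# Hypothetical helper: split on whitespace, then validate the pieces
# against the constraints stated in the question.
sub split_fields {
    my ($str) = @_;
    my @words = split ' ', $str;
    my $num   = pop @words;

    # The last field must be a 2- to 5-digit number.
    return unless defined $num && $num =~ /\A\d{2,5}\z/;
    # There must be 1 to 3 words.
    return unless @words >= 1 && @words <= 3;
    # Every word must consist of uppercase letters only.
    return if grep { !/\A[A-Z]+\z/ } @words;

    return (@words, $num);
}

my @fields = split_fields('BULDING STEVENSON 7255');
print join(', ', @fields), "\n";
```

The helper returns an empty list when any check fails, so the caller can test the result with a plain `if (@fields)`.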