Perl regex for multiple pattern grouping and multiline regex

I have an input txt file containing multiple lines in the format shown below.

JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD 
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD 
till this end ___________ 021542 some random digits.

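I am trying to read this file and extract the matched patterns as capture groups.

Below is what I tried. Grouping the first match works and it is captured correctly. The problem shows up with the second group: it does not take the following lines into account.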

open(IFH,'<',"file.txt");

while ($line = <IFH>) {
    if ($line =~ /^\s*(\w+\_\d*.*)\s*::(.*)/s) {
        print "$1\n";
        print "$2\n";
    }
}
close(IFH);

Expected result:

print $1; # this should give me

JMOD_01
fdgh_6765_546/456

Then, print $2; # this should give me

"This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD last line"

"This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD till this end"

Then, print $3; # this should give me

"5564 numerical digits"
"021542 some random digits"

But the actual output for the second group is different: print $2; # actual output

"This is first starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF"

"This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF"
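The second group stops at the end of a line because `while ($line = <IFH>)` hands the regex one line at a time, so the `/s` modifier has nothing further to match against. One possible way around this, sketched below, is to read the file in paragraph mode by setting `$/` to the empty string. This assumes the blocks are separated by blank lines; the in-memory filehandle and the names `$data` and `@records` are only there to keep the sketch self-contained and stand in for file.txt.

```perl
use strict;
use warnings;

# Sample data standing in for file.txt (assumption: blocks are separated by blank lines)
my $data = <<'END';
JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF
ERIEFF FJDKF OIOIISD SDJKD
till this end ___________ 021542 some random digits.
END

# In-memory filehandle so the sketch runs without a file on disk
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = '';    # paragraph mode: each read returns one blank-line-separated block
    while (my $record = <$fh>) {
        chomp $record;    # in paragraph mode chomp strips all trailing newlines
        push @records, $record;
    }
}
close($fh);

for my $i (0 .. $#records) {
    # Count the lines in each block to show that a read now spans several lines
    my $lines = 1 + ($records[$i] =~ tr/\n//);
    print "Record ", $i + 1, " spans $lines line(s)\n";
}
```

Each element of `@records` then holds a whole block, so a pattern with `/s` can match across the line breaks inside it.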

If I understand the problem correctly, we can most likely use two simple expressions to extract the data we want, perhaps something like:

([A-Z_0-9]+)\s+::\s+([\s\S]+)

Demo 1

Test

use strict;

my $str = 'JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD 
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD 
till this end ___________ 021542 some random digits.

';
my $regex = qr/([A-Z_0-9]+)\s+::\s+([\s\S]+)/p;

# A single match is enough here: group 2 runs to the end of the string
if ( $str =~ $regex ) {
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via $-[0] and $+[0]\n";
  print "Capture Group 1 is $1 and its start/end positions can be obtained via $-[1] and $+[1]\n";
  print "Capture Group 2 is $2 and its start/end positions can be obtained via $-[2] and $+[2]\n";
}

# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}

and, to extract our digits:

([0-9]+\snumerical digits|[0-9]+\ssome random digits)

Demo 2

Test

use strict;

my $str = 'JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD 
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF  
ERIEFF FJDKF OIOIISD SDJKD 
till this end ___________ 021542 some random digits.

';
my $regex = qr/([0-9]+\snumerical digits|[0-9]+\ssome random digits)/p;

# Loop with /g so that every match is reported, not only the first one
while ( $str =~ /$regex/gp ) {
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via $-[0] and $+[0]\n";
  print "Capture Group 1 is $1 and its start/end positions can be obtained via $-[1] and $+[1]\n";
}

# ${^POSTMATCH} and ${^PREMATCH} are also available with the use of '/p'
# Named capture groups can be called via $+{name}
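The two expressions above each return part of the data. Since the question asks for `$1`, `$2` and `$3` per block, here is a rough sketch of one combined pattern applied block by block. It assumes that blocks are separated by blank lines, that the `ID ::` prefix is optional, and that the run of underscores always separates the text from the trailing digits. The names `$record_re` and `@parsed` and the hash keys are made up for this sketch.

```perl
use strict;
use warnings;

my $str = <<'END';
JMOD_01 :: This is starting of grouping 2nd KFGJHFG RTIRT DFB SFJKF ERIEFF FJDKF OIOIISD SDJKD
last line ______________ 5564 numerical digits.

This is second starting of grouping 2nd KFGJHFG RTIRT FSFJKF
ERIEFF FJDKF OIOIISD SDJKD
till this end ___________ 021542 some random digits.
END

# Hypothetical combined pattern (an assumption about the format, not taken from the question)
my $record_re = qr/
    ^\s*
    (?: (\w+) \s* :: \s* )?   # group 1: optional identifier such as JMOD_01
    (.*?)                     # group 2: free text, may span several lines
    \s* _+ \s*                # separator: a run of underscores
    (.+?)                     # group 3: digits plus the words after them
    \.? \s* \z                # optional final dot and trailing whitespace
/xs;

my @parsed;
for my $record (split /\n\s*\n/, $str) {    # split the input on blank lines
    next unless $record =~ $record_re;
    my ($id, $text, $tail) = ($1, $2, $3);
    $text =~ s/\s+/ /g;                     # fold line breaks into single spaces
    push @parsed, { id => $id, text => $text, tail => $tail };
    printf "id=%s\ntext=%s\ntail=%s\n\n", $id // '(none)', $text, $tail;
}
```

The second block has no `::` prefix, so its identifier comes back undefined and is printed as `(none)`. If the real file can contain underscores inside the free text, the separator part of the pattern would need to be made stricter.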

RegEx Circuit

jex.im visualizes regular expressions: