Using multiple backreferences in Perl
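I am trying to use multiple backreferences (capture groups) in Perl to capture five different parts of a string, but I get no match for any group except the first.
Here is what I tried: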
my $string = ">abc|XYUXYU|KIOKEIK_7XNCU Happy, not-happy apple banana X ORIG=Came from trees NBMR 12345 OZ=1213379 NG=popZ AZ=2 BU=1";
$string =~ m/>(abc)|(.*)|.*ORIG=(.*)[A-Z].*NG=(.*)\s(.*)\s/;
print "First match should be 'abc'. We got: $1\n";
print "Second match should be 'XYUXYU'. We got: $2\n";
print "Third match should be 'Came from trees'. We got: $3\n";
print "Fourth match should be 'popZ'. We got: $4\n";
print "Fifth match should be 'AZ=2'. We got: $5\n";
The output I want is:
First match should be 'abc'. We got: abc
Second match should be 'XYUXYU'. We got: XYUXYU
Third match should be 'Came from trees'. We got: Came from trees
Fourth match should be 'popZ'. We got: popZ
Fifth match should be 'AZ=2'. We got: AZ=2
Any idea how to solve this the right way in Perl?
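You have to escape each | by prefixing it with \; otherwise it means alternation (a|b matches a or b). That is why only the first group was set: the first alternative, >(abc), succeeds on its own and the other alternatives are never tried. For your third capture, you have to make the quantifier * non-greedy by appending ?. You also need to slightly adjust the pattern after the third capture group so that it matches a space followed by at least one uppercase character. (It is not entirely clear what the full range of possible inputs is, since you gave only a single example without further detail, so the pattern may need further adjustment.)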
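The two effects described above can be seen in isolation on small strings. This is only a sketch for illustration; the test strings and variable names are made up and are not part of the original question.

```perl
use strict;
use warnings;

# Unescaped | is alternation: the regex succeeds on the first branch alone,
# so the group in the second branch is never set (it comes back as undef).
my @alt = "a|b" =~ /(a)|(b)/;
print defined $alt[1] ? "second group set\n" : "second group undefined\n";

# Escaped \| matches a literal pipe, so both groups take part in one match.
my @lit = "a|b" =~ /(a)\|(b)/;
print "literal pipe: $lit[0] and $lit[1]\n";

# Greedy .* extends to the last possible " X"; non-greedy .*? stops at the first.
my $text = "k=one two X three X";
my ($greedy) = $text =~ /k=(.*)\sX/;
my ($lazy)   = $text =~ /k=(.*?)\sX/;
print "greedy: $greedy\n";   # one two X three
print "lazy: $lazy\n";       # one two
```

In list context a match returns the captured groups directly, which avoids reading $1, $2, and so on after a match that may have failed.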
#!/usr/bin/perl
use strict;
use warnings;
my $string = ">abc|XYUXYU|KIOKEIK_7XNCU Happy, not-happy apple banana X ORIG=Came from trees NBMR 12345 OZ=1213379 NG=popZ AZ=2 BU=1";
# Pipes are escaped so they match literally; (.*?) is non-greedy and
# stops before the first space followed by an uppercase word.
$string =~ m/>(abc)\|(.*)\|.*ORIG=(.*?)\s[A-Z]+.*NG=(.*)\s(.*)\s/
    or die "Pattern did not match\n";
print "First match should be 'abc'. We got: $1\n";
print "Second match should be 'XYUXYU'. We got: $2\n";
print "Third match should be 'Came from trees'. We got: $3\n";
print "Fourth match should be 'popZ'. We got: $4\n";
print "Fifth match should be 'AZ=2'. We got: $5\n";
Output:
First match should be 'abc'. We got: abc
Second match should be 'XYUXYU'. We got: XYUXYU
Third match should be 'Came from trees'. We got: Came from trees
Fourth match should be 'popZ'. We got: popZ
Fifth match should be 'AZ=2'. We got: AZ=2
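If the pattern does need further adjustment, one possible direction is to replace the greedy .* pieces with narrower classes so each group can only match what it is meant to. The following is a sketch under assumptions that go beyond the single example given: that fields never contain a pipe, that the ORIG value ends just before an all-uppercase word, and that the NG value and the token after it contain no whitespace. The variable names are illustrative.

```perl
use strict;
use warnings;

my $string = ">abc|XYUXYU|KIOKEIK_7XNCU Happy, not-happy apple banana X ORIG=Came from trees NBMR 12345 OZ=1213379 NG=popZ AZ=2 BU=1";

# List-context match: the five groups are returned directly,
# or an empty list if the pattern does not match.
my ($id, $code, $orig, $ng, $next) = $string =~ m{
    ^>(abc)           # literal prefix
    \|([^|]*)         # second field: everything up to the next pipe
    \|.*?ORIG=(.*?)   # ORIG value, as short as possible
    \s[A-Z]+\s        # and ending before an all-uppercase word (assumption)
    .*?NG=(\S+)       # NG value: one run of non-whitespace (assumption)
    \s+(\S+)          # the next whitespace-separated token
}x;

if (defined $id) {
    print "$id\n$code\n$orig\n$ng\n$next\n";
} else {
    print "no match\n";
}
```

The /x flag allows whitespace and comments inside the pattern, and \S+ for the last two groups means the match no longer depends on there being trailing whitespace after the fifth value.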