Perl Regex - Print the matched Conditional Regex
I am trying to extract some patterns from a log file, but I cannot get them to print correctly.
Examples of log strings:
1) sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666
2) sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616
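I want to extract 3 things:
A = "FPJ.INV_DOM_16_PRD"
B = "47269"
C = 9616 or 2644666 (if the line has endid then C = 2644666, otherwise it is 9616)
A log line can be of type 1 or type 2. I am able to extract A and B, but I am stuck on C: it needs a conditional, and I cannot get it to extract properly. Here is my code: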
my $string='/sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666';
if ($string =~ /sequence_history\/buckets\/(.*)/){
    my $line = $1;
    print "$line\n";
    if($line =~ /(FPJ.*PRD)\.(\d*)\./){
        my $topic_type_string = $1;
        my $topic_id = $2;
        print "$topic_type_string\n$topic_id\n";
    }
    if($string =~ /(?(?=endid=)\d*$)/){
        # how to print match pattern here?
        print "match\n";
    }
}
Thanks in advance!
This will do the job:
use Modern::Perl;
use Data::Dumper;
my $re = qr/(FPJ.+?PRD)\.(\d+)\..*?(\d+)$/;
while(<DATA>) {
    chomp;
    my (@l) = $_ =~ /$re/g;
    say Dumper\@l;
}
__DATA__
sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666
sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616
Output:
$VAR1 = [
          'FPJ.INV_DOM_16_PRD',
          '47269',
          '2644666'
        ];
$VAR1 = [
          'FPJ.INV_DOM_16_PRD',
          '41987',
          '9616'
        ];
Explanation:
( : start group 1
FPJ : literally FPJ
.+? : 1 or more any character but newline, not greedy
PRD : literally PRD
) : end group 1
\. : a dot
( : start group 2
\d+ : 1 or more digit
) : end group 2
\. : a dot
.*? : 0 or more any character not greedy
( : start group 3
\d+ : 1 or more digit
) : end group 3
$ : end of string
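If you would rather make the endid rule explicit instead of relying on "the last run of digits on the line", an optional group is one way to express it. This is only a sketch, not the code from the answer above: the sub name parse_bucket and the variable names are made up for illustration, and it assumes endid, when present, appears in the query string after the "?".

```perl
use strict;
use warnings;

# Hypothetical helper: returns (A, B, C), or an empty list when the line does not match.
sub parse_bucket {
    my ($line) = @_;
    return unless $line =~ m{
        (FPJ.+?PRD) \.                # A: topic type string
        (\d+) \.                      # B: topic id
        (\d+)                         # last path number, used as C when endid is absent
        (?: \? .* \b endid=(\d+) )?   # optional query string that carries endid
    }x;
    # Prefer endid (group 4) when it was captured, otherwise fall back to group 3.
    return ($1, $2, defined $4 ? $4 : $3);
}

for my $line (
    'sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666',
    'sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616',
) {
    my ($topic, $id, $c) = parse_bucket($line) or next;
    print "A=$topic B=$id C=$c\n";
}
```

A line with a query string but no endid falls back to the last path number, which may or may not be what you want for your logs.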
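If you are trying to pull entries out of a log file, you can use a file handle in Perl. In the code below, I read the entries from a log file named test.log.
The log entries are as follows.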
sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.2644?startid=2644000&endid=2644666
sequence_history/buckets/FPJ.INV_DOM_16_PRD.41987.9616
sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.69886?startid=2644000&endid=26765849
sequence_history/buckets/FPJ.INV_DOM_16_PRD.47269.24465?startid=2644000&endid=836783741
Below is the Perl script that fetches the required data.
#!/usr/bin/perl
use strict;
use warnings;
open (FH, "test.log") || die "Not able to open test.log $!";
my ($a,$b,$c);
while (my $line=<FH>)
{
    if ($line =~ /sequence_history\/buckets\/.*endid=(\d*)/)
    {
        # endid is present: C is the endid value
        $c=$1;
        if ($line =~ /(FPJ.*PRD)\.(\d*)\.(\d*)\?/)
        {
            $a=$1;
            $b=$2;
        }
    }
    else
    {
        # no endid: C is the last number in the path
        if ($line =~ /sequence_history\/buckets\/(FPJ.*PRD)\.(\d*)\.(\d*)/)
        {
            $a=$1;
            $b=$2;
            $c=$3;
        }
    }
    # \$ prints a literal dollar sign so the variable name shows up in the output
    print "\n \$a=$a\n \$b=$b\n \$c=$c \n";
}
Output:
$a=FPJ.INV_DOM_16_PRD
$b=47269
$c=2644666
$a=FPJ.INV_DOM_16_PRD
$b=41987
$c=9616
$a=FPJ.INV_DOM_16_PRD
$b=47269
$c=26765849
$a=FPJ.INV_DOM_16_PRD
$b=47269
$c=836783741
You can use the code above on your own log by replacing "test.log" with the name (and path) of the log file you want to read, as shown below.
open (FH, "/path/to/log/file/test.log") || die "Not able to open test.log $!";
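As a variation on the script above, here is a minimal sketch that uses the three-argument open with a lexical filehandle and the single pattern from the first answer, so each line needs only one match. The sub name extract_from_log is hypothetical, and it assumes every line of interest ends with the number you want as C.

```perl
use strict;
use warnings;

# Hypothetical helper: reads the log at $path and returns one [A, B, C] triple per matching line.
sub extract_from_log {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Not able to open $path: $!";
    my @rows;
    while (my $line = <$fh>) {
        chomp $line;
        # Same pattern as the first answer: the last run of digits on the line becomes C.
        if ($line =~ /(FPJ.+?PRD)\.(\d+)\..*?(\d+)$/) {
            push @rows, [$1, $2, $3];
        }
    }
    close($fh);
    return @rows;
}
```

You could then print the results with something like: printf "A=%s B=%s C=%s\n", @$_ for extract_from_log("test.log");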