Perl: apply regular expressions in sequence

Question

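I need help writing a Perl one-liner that finds a string in a file and extracts a floating-point number, possibly in exponent notation, from that string.

For example, I have a text file named results.log: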

...
TOL: 0.0244141
ort: 0.000282395
Q orthogonality: True

EPS: 0.000488281
err: 9.58692e-05
QR decomposition: True

Success: True
...

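It contains the results of a numerical experiment. I want to find the line that starts with TOL: and extract the tolerance value 0.0244141. I can write a one-liner that finds the line starting with TOL: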

perl -ne '/TOL:/ && print' results.log
TOL: 0.0244141

I can also find a line that contains the floating-point number 0.0244141:

echo "TOL: 0.0244141" | perl -ne '/\d+.\d+/ && print'

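Is there a way to "stack" the two regular expressions and apply them one after another to extract the value itself? In other words, can a regular expression be applied to the result of a previous one?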

To finish the task, I want to call this one-liner from a Perl script and store the extracted result in a variable:

my $tol = system( qq{ perl -ne '... && print' results.log } );

Answer 1

If I understand correctly, you just need to combine the regexes you already have:

perl -ne '/TOL: (\d+.\d+)/ && print $1 . "\n"' results.log

Output:

0.0244141

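The parentheses capture whatever is matched inside them. Each pair of ( ... ) assigns what it matched to a new numbered variable: the first group to $1, the second to $2, and so on.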
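As a small illustration of the numbered variables, here is a sketch with two capture groups. The sample line and the variable names are made up for the example, and the dot is escaped here so that it only matches a literal decimal point.

```perl
use strict;
use warnings;

# Hypothetical sample line, modeled on results.log
my $line = "TOL: 0.0244141";

# Group 1 captures the label, group 2 captures the number
if ( $line =~ /^(\w+): (\d+\.\d+)/ ) {
    my ( $label, $value ) = ( $1, $2 );
    print "label=$label value=$value\n";    # prints "label=TOL value=0.0244141"
}

# In list context a match returns its captures directly,
# so the numbered variables are not needed at all
my ( $label, $value ) = $line =~ /^(\w+): (\d+\.\d+)/;
```

The list-context form is often convenient when the captures are going straight into named variables.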

More on this topic: Capture groups

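If you want this as part of an existing Perl script, don't use system() to start another perl interpreter. Just open the file from the existing script. Here is an example where I put it in a subroutine: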

sub print_TOL {
    # extract the first argument to the function
    my $filename = shift;

    # open the file - or `die` with an error message
    open my $fh, '<', $filename or die "$0: ERROR: $filename: $!";

    # read line by line from the file into $_
    while(<$fh>) {
        if( /TOL: (\d+.\d+)/ ) {  # same match as before
            print $1 . "\n";
            # If you only want to print the first match, use "last;" here.
            #last;  
        }
    }
}

print_TOL 'results.log';
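To store the value in a variable, as the question asks, the sub can return the capture instead of printing it. The following is only a sketch and not part of the original answer: the name extract_value, the $number pattern and the in-memory sample text are assumptions made for illustration. The pattern escapes the dot and accepts an optional exponent, so values such as 9.58692e-05 match as well.

```perl
use strict;
use warnings;

# Assumed pattern: optional sign, digits with optional fraction, optional exponent
my $number = qr/[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?/;

# Hypothetical helper: return the first value found for $key, or undef
sub extract_value {
    my ( $fh, $key ) = @_;
    while ( my $line = <$fh> ) {
        return $1 if $line =~ /^\Q$key\E:\s*($number)/;
    }
    return;
}

# Sample text standing in for results.log, read through an in-memory filehandle
my $text = "TOL: 0.0244141\nort: 0.000282395\nerr: 9.58692e-05\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $tol = extract_value( $fh, 'TOL' );
seek $fh, 0, 0;    # rewind before searching again
my $err = extract_value( $fh, 'err' );

print "$tol\n";    # 0.0244141
print "$err\n";    # 9.58692e-05
```

With a real file, the open call would take the filename (for example 'results.log') instead of the reference to $text.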

Answer 2

A nice and flexible solution is to read the values into a hash; then you can use them however you like.

use strict;
use warnings;

my $log = "results.log";
open my $fh, "<", $log or die "Cannot open $log: $!";
my %log;     # declare variable to store values

while (<$fh>) {   # while we can read a line from the file
    chomp;        # remove newline
    my ($key, $val) = split / *: */, $_, 2;   # split the line on :, also remove whitespace
    next unless defined $val;     # skip lines which do not contain values
    $log{$key} = $val;            # store the value in the appropriate key
}

print $log{TOL};    # <--- value is in $log{TOL}
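As a rough sketch of what "using the values" might look like, the same loop can be run on an in-memory sample and the results compared numerically. The sample text and the $within_eps variable are assumptions for this example only.

```perl
use strict;
use warnings;

# Sample text standing in for results.log (assumption for this sketch)
my $text = "TOL: 0.0244141\nEPS: 0.000488281\nerr: 9.58692e-05\nSuccess: True\n\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my %log;
while (<$fh>) {
    chomp;
    my ( $key, $val ) = split / *: */, $_, 2;
    next unless defined $val;    # skips blank lines and lines without a colon
    $log{$key} = $val;
}

# Perl converts numeric strings, including exponent notation, on demand
my $within_eps = $log{err} < $log{EPS} ? 1 : 0;
print "err is below EPS\n" if $within_eps;

# List everything that was parsed
print "$_ => $log{$_}\n" for sort keys %log;
```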

All values from the file are now stored in %log. Of course, if you are only interested in the TOL value, you can simply do:

my $tol;
while (<$fh>) {
    if (/^TOL: (.+)/) {
        $tol = $1;
        last;              # skip to end
    }
}

Compared with a shell call, the benefit of doing it this way is that it is faster and error handling is easier.
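It may also help to see why the system() call from the question would not store the value: system returns the exit status of the command, not its output, whereas backticks (qx) capture the output. A minimal sketch, using $^X (the path of the running perl) so that it does not depend on any file; the one-line child programs are made up for the example:

```perl
use strict;
use warnings;

# $^X is the path of the perl binary running this script
my $status = system( $^X, '-e', 'print "0.0244141\n"' );    # child prints to STDOUT
print "system returned: $status\n";    # exit status, 0 on success

# qx (backticks) captures the child's output instead
my $output = qx{"$^X" -e "print 42"};
print "captured: $output\n";
```

Even so, both forms start a second interpreter, which is the overhead the answers above avoid by reading the file directly.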

Tags: regex, perl, text-extraction