How can I perform both a negative lookahead and a negative lookbehind in a single Perl regex?

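In a multi-line string, on each line, I want to delete everything from the first unescaped percent sign to the end of the line, with one exception: if the unescaped percent sign occurs in the position \d\d:\d\d%:\d\d, I want to leave it alone.

(The string is LaTeX / TeX code, and the percent sign starts a comment. I want to treat a comment inside an HH:MM:SS string as a special case, where the seconds have been commented out of the time string.)

The code below almost does it: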

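  1. It uses a negative lookbehind to leave \% alone
  2. It uses non-greedy matching to match the first %, not the last
  3. It uses another negative lookbehind to skip \d\d:\d\d%
  4. But it fails to distinguish between \d\d:\d\d%anything and \d\d:\d\d%:\d\d, and skips both.
  5. My attempts to add a negative lookahead did not help. Is there a way to do this?
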
#!/usr/bin/perl
use strict; use warnings;

my $string = 'for 10\% and %delete-me
for 10\% and 2021-03-09 Tue 02:59%:02 NO DELETE %delete-me
for 10\% and 2021-03-09 Tue 04:09%anything  %delete-me
for 10 percent%delete-me';

print "original string:\n";
print "$string<<\n";

{
    my $tochange = $string;
    $tochange =~ s/
        (^.*?
        (?<!\\)
        )
        (\%.*)
        $//mgx;
    print "\ndelete after any unescaped %\n";
    print "$tochange<<\n";
}

{
    my $tochange = $string;
    $tochange =~ s/
        (^.*?
        (?<!\d\d:\d\d)
        (?<!\\)
        )
        (\%.*)
        $//mgx;
    print "\nexception for preceding HH:MM\n";
    print "$tochange<<\n";
}

{
    my $tochange = $string;
    $tochange =~ s/
        (^.*?
        (?<!\d\d:\d\d)
        (?<!\\)
        )
        (!?:\d\d)
        (\%.*)
        $//mgx;
    print "\nattempt to add negative lookahead\n";
    print "$tochange<<\n";
}


{
    my $tochange = $string;
    # attempt to add negative lookahead
    $tochange =~ s/
        (^.*?
        (?<!\d\d:\d\d)
        (?<!\\)
        )
        (\%.*)
        (!?:\d\d)
        $//mgx;
    print "\nattempt to add negative lookahead\n";
    print "$tochange<<\n";
}

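You can use the SKIP FAIL approach:

\d\d:\d\d%:\d\d(*SKIP)(*FAIL)|(?<!\\)%.*
  • \d\d:\d\d%:\d\d(*SKIP)(*FAIL)| matches the pattern that you want to avoid, then discards that match so the % inside it is never reconsidered
  • (?<!\\)%.* is a negative lookbehind asserting that what is directly to the left is not a \, followed by a match of % and the rest of the line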


For example:

$tochange =~ s/\d\d:\d\d%:\d\d(*SKIP)(*FAIL)|(?<!\\)%.*//g;
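
To see the substitution above on the sample input from the question, here is a minimal self-contained sketch. The sub names `strip_comments` and `strip_comments_lookaround` are hypothetical, chosen only for illustration. The second sub is not part of the answer above: it is one possible lookaround-only variant that combines a negative lookbehind with a negative lookahead in a single pattern, as the question's title asks, and it assumes the protected form is always exactly two digits, a colon, two digits, `%`, a colon and two digits.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Same sample input as in the question.
my $string = 'for 10\% and %delete-me
for 10\% and 2021-03-09 Tue 02:59%:02 NO DELETE %delete-me
for 10\% and 2021-03-09 Tue 04:09%anything  %delete-me
for 10 percent%delete-me';

# SKIP FAIL version (hypothetical helper name).
sub strip_comments {
    my ($text) = @_;
    $text =~ s/
        \d\d:\d\d%:\d\d (*SKIP)(*FAIL)   # protected time string: consume it, then reject
        |
        (?<!\\) % .*                     # unescaped percent sign through end of line
    //gx;
    return $text;
}

# Lookaround-only variant (an assumption, not taken from the answer).
sub strip_comments_lookaround {
    my ($text) = @_;
    $text =~ s/
        (?<!\\) %                        # percent sign not preceded by a backslash
        (?! (?<=\d\d:\d\d%) :\d\d )      # reject only when HH:MM is behind and :SS is ahead
        .*                               # rest of the line
    //gx;
    return $text;
}

print strip_comments($string), "<<\n";
print strip_comments_lookaround($string), "<<\n";
```

Run as written, both subs should print the same text: every `%delete-me` tail is removed, the third line is cut right after `04:09` because `%anything` is an ordinary comment, and `02:59%:02 NO DELETE` is kept. Neither pattern handles a percent sign preceded by an escaped backslash (`\\%`); that case would need extra logic.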