How can I perform both negative lookahead and negative lookbehind in a single Perl regex?
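In a multi-line string, on each line, I want to delete everything from the first unescaped percent sign to the end of the line, with one exception: if the unescaped percent sign occurs in the position \d\d:\d\d%:\d\d, I want to leave it alone.
(The string is LaTeX / TeX code, where a percent sign starts a comment. I want to treat a comment inside an HH:MM:SS string, where the seconds have been commented out of the time string, as a special case.)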
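The code below almost does it:
- it uses one negative lookbehind to leave \% alone
- it uses "ungreedy" matching to match the first % rather than the last
- it uses another negative lookbehind to skip \d\d:\d\d%
- but it fails to distinguish \d\d:\d\d%anything from \d\d:\d\d%:\d\d, and skips both
- my attempts to add a negative lookahead did not help. Is there a way to do this?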
#!/usr/bin/perl
use strict; use warnings;
my $string = 'for 10\% and %delete-me
for 10\% and 2021-03-09 Tue 02:59%:02 NO DELETE %delete-me
for 10\% and 2021-03-09 Tue 04:09%anything %delete-me
for 10 percent%delete-me';
print "original string:\n";
print "$string<<\n";
{
my $tochange = $string;
$tochange =~ s/
(^.*?
(?<!\\)
)
(\%.*)
$//mgx;
print "\ndelete after any unescaped %\n";
print "$tochange<<\n";
}
{
my $tochange = $string;
$tochange =~ s/
(^.*?
(?<!\d\d:\d\d)
(?<!\\)
)
(\%.*)
$//mgx;
print "\nexception for preceding HH:MM\n";
print "$tochange<<\n";
}
{
my $tochange = $string;
$tochange =~ s/
(^.*?
(?<!\d\d:\d\d)
(?<!\\)
)
(!?:\d\d)
(\%.*)
$//mgx;
print "\nattempt to add negative lookahead\n";
print "$tochange<<\n";
}
{
my $tochange = $string;
# attempt to add negative lookahead
$tochange =~ s/
(^.*?
(?<!\d\d:\d\d)
(?<!\\)
)
(\%.*)
(!?:\d\d)
$//mgx;
print "\nattempt to add negative lookahead\n";
print "$tochange<<\n";
}
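You can use the SKIP FAIL approach:
\d\d:\d\d%:\d\d(*SKIP)(*FAIL)|(?<!\\)%.*
- \d\d:\d\d%:\d\d(*SKIP)(*FAIL)| matches the pattern that you want to avoid, then discards that match and resumes the search after it
- (?<!\\)%.* is a negative lookbehind asserting that there is no \ directly to the left, followed by a match of % and the rest of the line
For example:
$tochange =~ s/\d\d:\d\d%:\d\d(*SKIP)(*FAIL)|(?<!\\)%.*//g;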
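To see the substitution act on the sample string from the question, here is a minimal, self-contained sketch. The sub names `strip_skip_fail` and `strip_lookaround` are hypothetical, chosen only for illustration. The second sub is not from the answer: it is an assumed alternative that stays within plain lookarounds by nesting a lookbehind inside a negative lookahead. Both rely on `.` not matching a newline, so neither needs `/m` or `/s`; `(*SKIP)` and `(*FAIL)` need Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the SKIP/FAIL substitution wrapped in a sub.
sub strip_skip_fail {
    my ($text) = @_;
    $text =~ s{
        \d\d:\d\d%:\d\d (*SKIP)(*FAIL)   # HH:MM%:SS is consumed, then rejected
      |
        (?<!\\) % .*                     # unescaped % through end of line
    }{}gx;
    return $text;
}

# Assumed alternative (not from the answer): lookarounds only.
sub strip_lookaround {
    my ($text) = @_;
    $text =~ s{
        (?<!\\) %                        # a % not preceded by a backslash
        (?! (?<=\d\d:\d\d%) :\d\d )      # unless it sits inside HH:MM%:SS
        .*                               # rest of the line
    }{}gx;
    return $text;
}

# Sample input taken from the question.
my $string = join "\n",
    'for 10\% and %delete-me',
    'for 10\% and 2021-03-09 Tue 02:59%:02 NO DELETE %delete-me',
    'for 10\% and 2021-03-09 Tue 04:09%anything %delete-me',
    'for 10 percent%delete-me';

print "SKIP/FAIL:\n",  strip_skip_fail($string),  "<<\n";
print "lookaround:\n", strip_lookaround($string), "<<\n";
```

On this input both subs should keep `02:59%:02 NO DELETE ` on the second line and cut every other line at its first unescaped `%`. Note also that `(!?:\d\d)` in the question's attempts is a capturing group that matches an optional `!` followed by `:` and two digits; a negative lookahead is written `(?!...)`.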