Using regex in Perl to extract a substring or line from a blob of text

Question

I have a variable containing some text:

$foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";

How can I use a regex to get the line /test/this/is/a/directory?

I tried this:

my $foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";
$foo =~ /^\/test.*$/;
print "\n$foo\n";

But this just keeps printing the whole blob of text.

Answer 1

Your regex should be:

/\/test.*\n/

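The reason is that your pattern is applied to the entire text: without the /m modifier, ^ anchors only at the very start of the string and $ only at its end, so nothing limits the match to a single line. You need to express that you want to match up to the next newline. This regex includes the newline in the match.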

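With regexes there are many different ways to do this, so it depends on the context of what you are trying to accomplish. You could add the m modifier at the end. It treats the string as multiple lines, so that ^ and $ apply to each line rather than to the whole text. With this version the newline is not part of the match either, because . does not match a newline by default.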

/\/test.*/m is sufficient.

More information: https://perldoc.perl.org/perlre.html

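Also, print "$foo"; won't print the match, because the =~ operator returns a true or false value and does not reassign the variable to the match. You need to change the regex so that it captures, and then print the first capture group: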

$foo =~ /(\/test.*)/m;
print $1;
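
A slightly more defensive variant of that snippet, as a sketch: in list context a successful match returns its capture groups, so the result can be assigned directly and the no-match case handled explicitly. The variable name $line is introduced here purely for illustration.

```perl
use strict;
use warnings;

my $foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";

# In list context, a successful match returns its capture groups,
# so $line receives what (\/test.*) captured; on failure it is undef.
my ($line) = $foo =~ /(\/test.*)/m;

if (defined $line) {
    print "$line\n";
} else {
    print "no match\n";
}
```

Checking the result before using it avoids printing a stale $1 left over from an earlier successful match.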

Answer 2

Change your expression to

$foo =~ m~^\s*/test.*$~m;

See a demo on regex101.com.

This uses other delimiters (~) so you don't need to escape the /, additionally allows leading whitespace (\s*), and turns on multiline mode (m).
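
To see what the m modifier contributes to that pattern, here is a small sketch comparing the same anchored pattern with and without it. The variable names $without_m and $with_m are made up for this illustration.

```perl
use strict;
use warnings;

my $foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";

# Without /m, ^ anchors only at the very start of the string,
# so the pattern cannot begin at the /test line further down.
my $without_m = ($foo =~ m~^\s*/test.*$~)  ? "match" : "no match";

# With /m, ^ also matches right after each newline,
# so the pattern can start at the beginning of the /test line.
my $with_m    = ($foo =~ m~^\s*/test.*$~m) ? "match" : "no match";

print "without /m: $without_m\n";   # no match
print "with /m:    $with_m\n";      # match
```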

Answer 3

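The OP seems to want the specified line printed, not the whole blob of text. For that, we need to modify Jan's answer to capture and extract the actual match.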

my $foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";
$foo =~ m~^(\s*/test.*)$~m;
$foo = $1;
print "\n$foo\n";

Output:

/test/this/is/a/directory
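
Note that in the pattern above \s* sits inside the parentheses, so the captured text keeps the line's leading indentation. If only the path itself is wanted, one option (a sketch, not part of the original answer) is to move \s* outside the capture group. The name $path is chosen here for illustration.

```perl
use strict;
use warnings;

my $foo = "
    Garbage directory
    /test/this/is/a/directory
    /this/is/another/foo\nThisd is is\nDrop stuff testing\nRandom stuff emacs is great";

my $path;

# \s* is matched but not captured, so $1 holds only the path.
# $1 is read only inside the if, i.e. only after a successful match.
if ($foo =~ m~^\s*(/test.*)$~m) {
    $path = $1;
    print "$path\n";
}
```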
