RegEx for matching the last line

Question

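I'm supplying a regular expression as configuration (so switching to hand-written code to meet this requirement isn't that easy) in order to get the last line of the input. If I use /.*$/, it becomes slow for some inputs, for example '1'.repeat(1e6)+'\n2'. Is there a fast way to get the last line?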

Also, if using a RegEx as the matching configuration isn't a good idea, is there a better suggestion?

Answer 1

An optimized expression for finding the last line of a large input string is one that introduces an explicit boundary:

(?m)^.*\z

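In a language like PHP it would be written as /^.*\z/m (the slashes are delimiters and m is the multiline flag). The caret ^ keeps the engine from walking through the expensive .* at positions where the match cannot succeed: an attempt can only begin at the start of a line, so every position in the middle of a line is rejected immediately. So we define a well-known boundary that not only identifies the part we want but also lets the engine apply its built-in optimizations.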

The performance of this regex depends on the number of lines in the input string. So an input string like yours isn't a problem at all, but something like this would need some attention.

In both cases it runs fast and doesn't fail.
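The pattern above is written for a PCRE-style engine. The sample input in the question looks like JavaScript, which has no \z anchor, so the pattern cannot be pasted in unchanged there. Below is a minimal sketch of a rough analogue, assuming the negative lookahead (?![\s\S]) ("no character follows") as a stand-in for \z. The names lastLineRe and lastLine are hypothetical, and no timing claim is made for any particular engine.

```javascript
// Assumption: (?![\s\S]) plays the role of \z, which JavaScript lacks.
// With the m flag, ^ only matches at the start of a line, so attempts
// that would begin in the middle of a line are rejected right away.
const lastLineRe = /^.*(?![\s\S])/m; // hypothetical name

function lastLine(input) {
  const m = lastLineRe.exec(input);
  return m === null ? null : m[0];
}

console.log(lastLine("first\nsecond\nthird"));      // third
console.log(lastLine('1'.repeat(1e6) + '\n2'));     // 2
```

If the input ends with a newline, this sketch returns an empty string, because the last line is then empty.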

Answer 2

Let's try some tests...
Target string = 127,000 bytes, 1,057 lines

Regex1:   .*$
Options:  < none >
Completed iterations:   5  /  5     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    12.74 s,   12743.09 ms,   12743087 µs
Matches per sec:   392


Regex2:   .*\z
Options:  < none >
Completed iterations:   5  /  5     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    12.77 s,   12765.26 ms,   12765260 µs
Matches per sec:   391


Regex3:   \z
Options:  < none >
Completed iterations:   5  /  5     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    6.41 s,   6410.85 ms,   6410854 µs
Matches per sec:   779


Regex4:   $
Options:  < none >
Completed iterations:   5  /  5     ( x 1000 )
Matches found per iteration:   1
Elapsed Time:    6.36 s,   6364.10 ms,   6364098 µs
Matches per sec:   785
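For anyone who wants to run a comparable measurement in JavaScript, here is a rough sketch of a timing loop. It is not the tool that produced the numbers above. The target string is only an assumed stand-in with a similar size and line count, timeRegex is a hypothetical helper, and because JavaScript has no \z, the second candidate uses (?![\s\S]) instead. Absolute timings will differ by engine and machine.

```javascript
// Assumed stand-in for the test target: 1,057 lines, about 127,000 characters.
const line = 'x'.repeat(119);
const target = Array.from({ length: 1057 }, () => line).join('\n');

// Hypothetical helper: runs re against input `iterations` times and
// reports the last match text plus the elapsed wall-clock time.
function timeRegex(re, input, iterations) {
  const start = performance.now();
  let last = null;
  for (let i = 0; i < iterations; i++) {
    last = re.exec(input);
  }
  const elapsedMs = performance.now() - start;
  return { match: last === null ? null : last[0], elapsedMs };
}

const candidates = {
  unanchored: /.*$/,            // the pattern from the question
  anchored: /^.*(?![\s\S])/m,   // line-start anchor plus an end-of-input check
};

for (const [name, re] of Object.entries(candidates)) {
  const { match, elapsedMs } = timeRegex(re, target, 20);
  console.log(`${name}: ${elapsedMs.toFixed(2)} ms, match length ${match.length}`);
}
```

Both candidates should return the same last line. Only the elapsed time is expected to differ.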
