Negative lookahead vs Positive lookahead syntax

Question

Given the following text:

My name is foo.

My name is bar.

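The goal is to return every line that does or does not contain a particular substring. Both of the following regex patterns, one positive and one negative, return the same result:

Positive lookahead: ^(?=.*bar).*$ returns My name is bar.

Negative lookahead: ^((?!foo).)*$ returns My name is bar.

However, why does the negative lookahead need to be nested inside multiple sets of parentheses, with the qualifier . and the quantifier * separated by a parenthesis, while in the positive lookahead they can be adjacent (.*)?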
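A quick way to confirm that both patterns select the same line is to run them with Python's re module. This is only a minimal sketch: it assumes the two sentences sit on separate lines of a single string and uses re.MULTILINE so that ^ and $ anchor at line boundaries. The variable names are illustrative.

```python
import re

# Sample text from the question, assumed to be one string with two lines.
text = "My name is foo.\nMy name is bar."

# Positive lookahead: keep lines that contain "bar".
positive = re.compile(r"^(?=.*bar).*$", re.MULTILINE)

# Negative lookahead (tempered greedy token): keep lines without "foo".
negative = re.compile(r"^((?!foo).)*$", re.MULTILINE)

# finditer + group(0) returns the whole match even though the
# negative pattern contains a capturing group.
positive_lines = [m.group(0) for m in positive.finditer(text)]
negative_lines = [m.group(0) for m in negative.finditer(text)]

print(positive_lines)
print(negative_lines)
```

Both lists should contain only the line with bar.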

Answer 1

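The construct in which the negative lookahead is nested inside multiple sets of parentheses, together with the qualifier . and the quantifier *, is called a tempered greedy token. You do not have to use it in this scenario.

You can use an ordinary negative lookahead anchored at the start instead of the tempered greedy token: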
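The parentheses matter because they change how often the lookahead is evaluated. In ((?!foo).)* the group pairs the lookahead with a single dot, so the check is repeated before every character that is consumed; in (?!foo).* the lookahead runs only once, at the current position. A small sketch of the difference, using the first line from the question:

```python
import re

line = "My name is foo."

# Lookahead checked before every character: stops right before "foo".
tempered = re.match(r"((?!foo).)*", line)

# Lookahead checked once at position 0, then .* consumes the rest of the line.
checked_once = re.match(r"(?!foo).*", line)

print(repr(tempered.group(0)))
print(repr(checked_once.group(0)))
```

The first match ends just before foo, while the second runs straight through it, which is why the unparenthesized form cannot reject the line.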

^(?!.*foo).*$

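See the regex demo.

Here,

^ - matches the position at the start of the string
(?!.*foo) - a negative lookahead that fails the match if there is a foo somewhere on the line (or in the string, if DOTALL mode is on)
.*$ - any 0+ characters (other than line breaks if DOTALL mode is off) up to the end of the string/line.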
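To see how these pieces behave, the sketch below applies the pattern line by line with re.MULTILINE, and then once with re.DOTALL to illustrate the note about the lookahead spanning the whole string. The sample string and the choice of flags are assumptions made for illustration.

```python
import re

text = "My name is foo.\nMy name is bar."
pattern = r"^(?!.*foo).*$"

# MULTILINE: ^ and $ anchor at each line, and . stops at the line break,
# so the lookahead only inspects the current line.
per_line = [m.group(0) for m in re.finditer(pattern, text, re.MULTILINE)]

# DOTALL without MULTILINE: . also matches line breaks, so the lookahead
# scans the whole string, finds "foo", and nothing matches.
whole_string = re.search(pattern, text, re.DOTALL)

print(per_line)
print(whole_string)
```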

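What to use?

A tempered greedy token is usually less efficient. When you only need to check whether a string contains something or not, use a lookahead anchored at the start. However, a tempered greedy token may be required in some cases. See When to Use this Technique.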
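One case where a tempered greedy token is hard to avoid is matching from a start delimiter to an end delimiter without crossing another start delimiter. The sketch below uses made-up delimiters start and end (they are not from the original answer); a plain lazy .*? begins at the first start, while the tempered version yields the innermost pair.

```python
import re

# Hypothetical input with two "start" markers before a single "end".
text = "start a start b end"

# Lazy dot: begins at the first "start" and runs to the first "end".
lazy = re.search(r"start.*?end", text)

# Tempered token: the dot may not step onto another "start",
# so the match begins at the last "start" before "end".
tempered = re.search(r"start(?:(?!start).)*?end", text)

print(lazy.group(0))
print(tempered.group(0))
```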

Answer 2

Example

Given the string text = 'a123b456c', we want to use the substring '123' as an anchor

(?=123) Positive lookahead:    Matches substring '123' as a *forward* anchor 
(?<=123) Positive lookbehind:  Matches substring '123' as a *backward* anchor
(?!123) Negative lookahead:    Substring not matching '123' as a *forward* anchor
(?<!123) Negative lookbehind:  Substring not matching '123' as a *backward* anchor

'123' is used only as an anchor and is not replaced. See how it works:

import re

text = 'a123b456c'

re.sub(r'a(?=123)', '@', text)   # returns '@123b456c'; note '123' is not replaced
re.sub(r'(?<=123)b', '@', text)  # returns 'a123@456c'
re.sub(r'b(?!123)', '@', text)   # returns 'a123@456c', since '456' does not match '123'
re.sub(r'(?<!123)c', '@', text)  # returns 'a123b456@'
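Lookarounds are zero-width: they test the surrounding text without making it part of the match, which is why '123' survives the substitutions. A small sketch that inspects the match spans for the same text (the variable names are illustrative):

```python
import re

text = 'a123b456c'

# The lookahead checks for '123', but the match itself is only 'a'.
ahead = re.search(r'a(?=123)', text)

# The lookbehind checks for '123', but the match itself is only 'b'.
behind = re.search(r'(?<=123)b', text)

print(ahead.group(0), ahead.span())
print(behind.group(0), behind.span())
```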

Hope this helps.


regex

lookahead

negative-lookahead
