Negative pattern matching with regex in Python

I'm trying to use a negative lookahead to replace every part of the string that does not match a pattern:

regexPattern = '((?!*' + 'word1|word2|word3' + ').)*$'  
mytext= 'jsdjsqd word1dsqsqsword2fjsdjswrod3sqdq'
return re.sub(regexPattern, "P", mytext)

#Expected Correct Output:  'PPPPPPword1PPPPPPword2PPPPPword3PPP'

#BAD Output:  'jsdjsqd word1dsqsqsword2fjsdjswrod3sqdq'

I tried this, but it does not work (the string stays unchanged). How should I modify it? (I think this is a fairly difficult regex.)

You can use

import re
regex = re.compile(r'(word1|word2|word3)|.', re.S)
mytext = 'jsdjsqd word1dsqsqsword2fjsdjsword3sqdq'
print(regex.sub(lambda m: m.group(1) if m.group(1) else "P", mytext))
# => PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP


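The regex is (word1|word2|word3)|.:

  • (word1|word2|word3) - the character sequence word1, word2, or word3, captured in group 1
  • | - or...
  • . - any single character (including a newline, because the re.S / DOTALL flag is on)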

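If the keywords come from a list rather than a hard-coded pattern, the same alternation idea can be wrapped in a small helper. This is only a sketch: the function name mask_except and its parameters are made up for illustration, and it assumes the keywords should be matched literally (hence re.escape).

```python
import re

def mask_except(text, words, fill="P"):
    """Replace every character of `text` with `fill`, except occurrences of `words`.

    Hypothetical helper, not part of the original answer.
    """
    if not words:
        # Nothing to keep: every character gets masked.
        return fill * len(text)
    # Escape each word so regex metacharacters in it are matched literally;
    # try longer words first so a word is not shadowed by its own prefix.
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    pattern = re.compile("(" + alternation + ")|.", re.S)
    # Group 1 is set only when a keyword matched; otherwise the lone `.` matched.
    return pattern.sub(lambda m: m.group(1) if m.group(1) else fill, text)

print(mask_except("jsdjsqd word1dsqsqsword2fjsdjsword3sqdq",
                  ["word1", "word2", "word3"]))
# => PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP
```

Putting the keywords in the first branch of the alternation matters: at each position the engine tries them before falling back to the single-character branch.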

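You can use a two-stage approach: first, replace the characters that do match with some special character, then use the result as a mask to replace all the other characters.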

>>> import re
>>> text = 'jsdjsqd word1dsqsqsword2fjsdjsword3sqdq'
>>> p = 'word1|word2|word3'
>>> mask = re.sub(p, lambda m: 'X' * len(m.group()), text)
>>> mask
'jsdjsqd XXXXXdsqsqsXXXXXfjsdjsXXXXXsqdq'
>>> ''.join(t if m == 'X' else 'P' for (t, m) in zip(text, mask))
'PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP'

Of course, you may have to pick a character other than X, one that does not occur in the original string.
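One way to sidestep the sentinel-character caveat is to build the mask from match positions instead of from a placeholder string. The sketch below is a variation on the two-stage idea, not the answer's own code; the name mask_by_spans is hypothetical.

```python
import re

def mask_by_spans(text, pattern, fill="P"):
    """Keep the characters covered by a match of `pattern`; replace the rest with `fill`."""
    # Stage 1: a boolean mask, True for every index inside some match.
    keep = [False] * len(text)
    for m in re.finditer(pattern, text):
        for i in range(m.start(), m.end()):
            keep[i] = True
    # Stage 2: copy kept characters through and mask everything else.
    return "".join(ch if k else fill for ch, k in zip(text, keep))

print(mask_by_spans("jsdjsqd word1dsqsqsword2fjsdjsword3sqdq", "word1|word2|word3"))
# => PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP
```

Because the mask is a list of booleans rather than a string of X characters, an X that already appears in the input is handled like any other unmatched character.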