Negative pattern matching with regex in Python
I am trying to use a negative lookahead to replace every part of a string that does not match a pattern:
regexPattern = '((?!*' + 'word1|word2|word3' + ').)*$'
mytext= 'jsdjsqd word1dsqsqsword2fjsdjsword3sqdq'
return re.sub(regexPattern, "P", mytext)
#Expected Correct Output: 'PPPPPPword1PPPPPPword2PPPPPword3PPP'
#BAD Output: 'jsdjsqd word1dsqsqsword2fjsdjsword3sqdq'
I tried this, but it does not work (the string stays unchanged).
How can I fix it? (I think this is a fairly difficult regex.)
You can use
import re
regex = re.compile(r'(word1|word2|word3)|.', re.S)
mytext = 'jsdjsqd word1dsqsqsword2fjsdjsword3sqdq'
print(regex.sub(lambda m: m.group(1) if m.group(1) else "P", mytext))
# => PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP
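The regex is (word1|word2|word3)|. :
(word1|word2|word3) - the character sequence word1, word2, or word3 (captured in group 1)
| - or...
. - any single character (including a newline, because the re.S / DOTALL flag is set)
The replacement function returns the keyword unchanged when group 1 matched, and "P" otherwise.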
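The same alternation can be built from an arbitrary word list. This is a minimal sketch, assuming a hypothetical helper named mask_except (not part of the answer above), a non-empty list of non-empty words, and that the words should be matched literally, which is why re.escape is applied:

```python
import re

def mask_except(text, words, fill="P"):
    """Replace every character of text that is not part of one of words with fill.

    Hypothetical helper for illustration; assumes words is a non-empty
    list of non-empty strings.
    """
    # Try longer words first so a word is not shadowed by one of its prefixes.
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    pattern = re.compile("(" + alternation + ")|.", re.S)
    # Group 1 is set only when a keyword matched; otherwise "." consumed one character.
    return pattern.sub(lambda m: m.group(1) if m.group(1) is not None else fill, text)

print(mask_except('jsdjsqd word1dsqsqsword2fjsdjsword3sqdq', ['word1', 'word2', 'word3']))
# => PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP
```

Escaping matters when a keyword contains regex metacharacters such as "." or "+"; without it, those characters would be interpreted as part of the pattern.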
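You can use a two-stage approach: first, replace the characters that do match with some special character, then use the result as a mask to replace all the other characters.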
>>> import re
>>> text= 'jsdjsqd word1dsqsqsword2fjsdjsword3sqdq'
>>> p = 'word1|word2|word3'
>>> mask = re.sub(p, lambda m: 'X' * len(m.group()), text)
>>> mask
'jsdjsqd XXXXXdsqsqsXXXXXfjsdjsXXXXXsqdq'
>>> ''.join(t if m == 'X' else 'P' for (t, m) in zip(text, mask))
'PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP'
Of course, you may have to choose a character other than X, one that does not occur in the original string.
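If no safe placeholder character is available, the mask can be kept separate from the text. This is a minimal sketch, assuming a hypothetical helper named mask_by_spans (not from the answer above) that records the matched positions with re.finditer instead of writing X into a copy of the string:

```python
import re

def mask_by_spans(text, pattern, fill="P"):
    """Keep the characters covered by a match of pattern; replace the rest with fill.

    Hypothetical helper for illustration only.
    """
    # One boolean per character: True means "inside a match, keep it".
    keep = [False] * len(text)
    for m in re.finditer(pattern, text):
        for i in range(m.start(), m.end()):
            keep[i] = True
    return "".join(ch if k else fill for ch, k in zip(text, keep))

print(mask_by_spans('jsdjsqd word1dsqsqsword2fjsdjsword3sqdq', 'word1|word2|word3'))
# => PPPPPPPPword1PPPPPPword2PPPPPPword3PPPP
```

Because the mask is a list of booleans rather than a string, it works even when the input already contains X or any other candidate placeholder.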