Regular expression for substitution of similar pattern in a string in Python

Question

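I want to use a regular expression to detect and substitute some phrases. These phrases follow the same pattern but differ at some points. All the phrases are in the same string.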

For instance, I have this string:

/this/is//an example of what I want /to///do

I want to capture all the words inside the slashes, including the slashes themselves, and replace them with "".

To solve this, I used the following code:

import re
txt = "/this/is//an example of what i want /to///do"
re.search("/.*/", txt, re.VERBOSE)
pattern1 = r"/.*?/\w+"
a = re.sub(pattern1, "", txt)

The result is:

' example of what i want '

This is what I want: the phrases inside the slashes are replaced with "". But when I run the same pattern on the following sentence

"/this/is//an example of what i want to /do"

I get

' example of what i want to /do'

How can I use a single regex to remove all the phrases together with their slashes, regardless of the number of slashes in a phrase?

Answer 1

You can use

/(?:[^/\s]*/)*\w+

See the regex demo. Details:

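/ - a slash
(?:[^/\s]*/)* - zero or more repetitions of: zero or more characters other than a slash or whitespace, followed by a slash
\w+ - one or more word characters.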
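
To make the breakdown above concrete, here is a small sketch that lists what the pattern actually matches in the two strings from the question. It uses re.findall instead of re.sub only so that each removed chunk is visible; the variable names are illustrative.

```python
import re

# Pattern from the answer above.
rx = re.compile(r"/(?:[^/\s]*/)*\w+")

# Sample strings taken from the question.
samples = [
    "/this/is//an example of what I want /to///do",
    "/this/is//an example of what i want to /do",
]

# The pattern has no capturing groups, so findall returns the full matches.
matches = [rx.findall(s) for s in samples]
for s, found in zip(samples, matches):
    print(repr(s), "->", found)
# '/this/is//an example of what I want /to///do' -> ['/this/is//an', '/to///do']
# '/this/is//an example of what i want to /do' -> ['/this/is//an', '/do']
```

Each slash-delimited run is consumed as a single match, and a chunk with only one slash (/do) still matches because the repeated group is allowed to occur zero times.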

See the Python demo:

import re
rx = re.compile(r"/(?:[^/\s]*/)*\w+")
texts = ["/this/is//an example of what I want /to///do", "/this/is//an example of what i want to /do"]
for text in texts:
    print(rx.sub('', text).strip())
# => example of what I want
#    example of what i want to

Answer 2

In your example code, you can omit the re.search call: it runs, but nothing is done with its result.
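
As for why the original pattern leaves /do behind, the following is a minimal sketch. /.*?/\w+ needs two slashes per match (an opening one, and a second one directly before the word characters), so a chunk containing a single slash cannot match. The shortened test strings and variable names are illustrative.

```python
import re

original = r"/.*?/\w+"

# Several slashes before the word characters: the pattern can match.
two_slashes = re.search(original, "want /to///do")
# Only one slash in the whole string: nothing can match the second '/'.
one_slash = re.search(original, "want to /do")

print(two_slashes.group() if two_slashes else None)  # /to///do
print(one_slash)                                     # None
```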

You can match one or more / followed by word characters:

/+\w+

Or, for a broader match, match one or more / followed by one or more characters other than / or whitespace:

/+[^\s/]+

/+ matches / one or more times
[^\s/]+ matches one or more characters other than whitespace or /
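
The difference between the two patterns only shows up when a slash-prefixed chunk contains non-word characters. A small sketch, using a made-up sample string that is not from the question:

```python
import re

# Hypothetical sample: the slash-prefixed chunks contain '-' and '.',
# which are not word characters.
sample = "keep /foo-bar and //baz.qux here"

# \w+ stops at the first non-word character, leaving '-bar' and '.qux'.
narrow = re.sub(r"/+\w+", "", sample)
# [^\s/]+ runs up to the next whitespace or slash, removing the whole chunk.
broad = re.sub(r"/+[^\s/]+", "", sample)

print(repr(narrow))  # 'keep -bar and .qux here'
print(repr(broad))   # 'keep  and  here'
```

For the strings in the question both patterns give the same result, since every chunk there consists only of slashes and word characters.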

Regex demo

import re

strings = [
    "/this/is//an example of what I want /to///do",
    "/this/is//an example of what i want to /do"
]

for txt in strings:    
    pattern1 = r"/+[^\s/]+"
    a = re.sub(pattern1, "", txt)
    print(a)

Output

 example of what I want 
 example of what i want to
