Replacing only the matched group in a file with multiple occurrences

Question

Input: /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */

Output: /* ABCD X 1111 # [[reason for comment]] */

Using the regex: regex = (?:[\/*]+\sPRQA[\s\w\,]*)(\*\/\s*\/\*\Comment[\w\,]+:)+(?:\s\[\[.*\/$)

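How can I use the regex above to replace the matched group with '#' at every occurrence in a file?

I tried re.sub(regex, '#', file.read(), re.MULTILINE), but this appends # to the matched group.

Is there a direct way to do this, rather than iterating line by line and replacing?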

Answer 1

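You can capture the parts you want to keep and put them back with backreferences, so that only the text between them is replaced with #:

re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)', r'\1#\2', file.read())

If you are sure these substrings only occur at the end of a line, add your $ anchor and use flags=re.M:

re.sub(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$', r'\1#\2', file.read(), flags=re.M)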
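The flag is passed by keyword above for a reason: the fourth positional parameter of re.sub is count, not flags, so re.sub(regex, '#', text, re.MULTILINE) treats the flag's integer value as a replacement limit and applies no flags at all. A minimal sketch of the difference, using a made-up sample string and pattern that are not from the question:

```python
import re

# Hypothetical sample data, chosen only to make the effect of `$` visible.
text = "x=1\nx=2\nx=3"
pattern = r"\d$"

# Same effect as re.sub(pattern, "#", text, re.MULTILINE): the flag's integer
# value (8) is used as `count`, and `$` keeps its default meaning
# (end of the whole string).
as_count = re.sub(pattern, "#", text, count=int(re.MULTILINE))

# Passing the flag by keyword makes `$` match at the end of every line.
as_flag = re.sub(pattern, "#", text, flags=re.MULTILINE)

print(repr(as_count))
print(repr(as_flag))
```

Only the second call replaces the digit at the end of each line.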

See the regex demo. Details:

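(/\*\s*ABCD[^*/]*) - Group 1 (\1): /*, zero or more whitespace characters, ABCD, and then zero or more characters other than * and /
\*/\s*/\*\s*Comment[^*:]+: - */, zero or more whitespace characters, /*, zero or more whitespace characters, Comment, one or more characters other than * and :, and then a :
(\s*\[\[[^][]*]]\s*\*/) - Group 2 (\2): zero or more whitespace characters, [[, zero or more characters other than [ and ], ]], zero or more whitespace characters, */.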
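To see what each group holds, the breakdown above can be checked on the sample input from the question. This is only an illustrative sketch; the variable names are arbitrary.

```python
import re

rx = re.compile(r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)')
line = "/* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */"

m = rx.search(line)
group1 = m.group(1)  # text kept before the '#'
group2 = m.group(2)  # text kept after the '#'

# Match.expand applies the same replacement template that re.sub would use.
rebuilt = m.expand(r'\1#\2')

print(repr(group1))
print(repr(group2))
print(rebuilt)
```

Everything matched between the two groups (the closing */, the opening /* and the Comment ...: prefix) is what gets dropped in favor of #.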

See the Python demo:

import re
# Groups 1 and 2 are the parts to keep; everything between them becomes '#'
rx = r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$'
text = "Some text ... /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */\nMore text here... Some text ... /* ABCD XD 1222 */ /* Comment 1112: [[reason for comment 2]] */"
print( re.sub(rx, r'\1#\2', text, flags=re.M) )

Output:

Some text ... /* ABCD X 1111 # [[reason for comment]] */
More text here... Some text ... /* ABCD XD 1222 # [[reason for comment 2]] */
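Since the question is about a file, here is one possible way to apply the same substitution to a whole file in a single pass, without looping over lines. It is a sketch under assumptions: the helper name merge_comments, the UTF-8 encoding and the temporary demo.c file are all made up for illustration.

```python
import re
import tempfile
from pathlib import Path

RX = re.compile(
    r'(/\*\s*ABCD[^*/]*)\*/\s*/\*\s*Comment[^*:]+:(\s*\[\[[^][]*]]\s*\*/)$',
    flags=re.M,
)

def merge_comments(path):
    """Rewrite `path` in place and return the number of replacements made."""
    source = path.read_text(encoding="utf-8")
    # subn returns both the new text and how many matches were replaced
    updated, count = RX.subn(r'\1#\2', source)
    if count:
        path.write_text(updated, encoding="utf-8")
    return count

# Demo on a throwaway file (hypothetical content)
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.c"
    demo.write_text(
        "int a; /* ABCD X 1111 */ /* Comment 1111: [[reason for comment]] */\n"
        "int b; /* ABCD XD 1222 */ /* Comment 1112: [[reason for comment 2]] */\n",
        encoding="utf-8",
    )
    replaced = merge_comments(demo)
    result = demo.read_text(encoding="utf-8")

print(replaced)
print(result)
```

Reading the whole file into memory is fine for typical source files; for very large inputs a streaming approach may be a better fit.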
