Looking for Italicized Text In Python using re.sub

Question

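The gist is that I'm writing a function that uses re.sub to remove the italics tags and duplicate the text that was inside them. The function takes one parameter, sentence, which holds a string.

A few examples: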

sentence = "<i>All of this text is italicized.</i>"
Return value: "All of this text is italicized. All of this text is italicized."

sentence = "<i>beep</i><i>bop</i><i>boop</i><i>bonk</i>"
Return value: "beep beepbop bopboop boopbonk bonk"

sentence = "I <i>Like</i>, food because <i>it's so great</i>!"
Return value: "I Like Like food because it's so great it's so great!"

This is what I have so far:

pattern = r'<.*?>'
return re.sub(pattern, i, sentence)

Can anyone help?

Answer 1

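First, your pattern is wrong: <.*?> matches each tag on its own (<i> or </i>) rather than the text between the tags, and a greedy <.*> would match everything from the first < to the last >, which is clearly not what you want either. Second, for i in sentence makes no sense: iterating over a string yields its individual characters, which would never match your pattern anyway.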
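
To see what a tag pattern matches, here is a small check with re.findall. It is only a sketch: the sample string and the variable names are illustrative assumptions, not part of the question's code.

```python
import re

# Illustrative input in the style of the question's examples (an assumption).
sample = "I <i>Like</i>, food"

# Lazy quantifier: stops at the first '>' it reaches, so each tag matches separately.
lazy_matches = re.findall(r'<.*?>', sample)

# Greedy quantifier: runs from the first '<' to the last '>' in the string.
greedy_matches = re.findall(r'<.*>', sample)

print(lazy_matches)    # ['<i>', '</i>']
print(greedy_matches)  # ['<i>Like</i>']
```

Neither form captures the text between the tags, which is why a capture group is needed.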

However, this seems to do what you want:

return re.sub('<i>(.*?)</i>', r'\1 \1', sentence)

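\1 is a reference to the first capture group, i.e. (.*?). Using it twice in the replacement string writes the matched text twice, which gives the duplication.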
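
For completeness, a self-contained sketch of the whole function might look like the following. The function name duplicate_italics is a hypothetical choice (the question does not name the function), and the sketch assumes the italic text contains no newlines, since . does not match a newline unless re.DOTALL is passed.

```python
import re

def duplicate_italics(sentence):
    """Remove <i>...</i> tags and write the enclosed text twice.

    The name is hypothetical; only the re.sub call comes from the answer.
    """
    # \1 refers back to whatever the lazy group (.*?) captured between the tags.
    return re.sub(r'<i>(.*?)</i>', r'\1 \1', sentence)

print(duplicate_italics("<i>All of this text is italicized.</i>"))
# All of this text is italicized. All of this text is italicized.
print(duplicate_italics("<i>beep</i><i>bop</i><i>boop</i><i>bonk</i>"))
# beep beepbop bopboop boopbonk bonk
```

Applied to the third example, this keeps the comma that follows the closing tag ("I Like Like, food because ..."), because only the tags and their contents are replaced.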

python

python-re