Removing links from Reddit comments using Python and regex

Question

I want to remove links written in the format Reddit uses:

comment = "Hello this is my [website](https://www.google.com)"

no_links = RemoveLinks(comment)

# no_links == "Hello this is my website"

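I found this, but I don't know how to translate it to Python.

I'm not very familiar with regular expressions, so I'd appreciate it if you could explain what is going on.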

Answer 1

You can do the following:

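import re

# Raw string, so the backslashes reach the regex engine unchanged
pattern = re.compile(r'\[(.*?)\]\(.*?\)')
comment = "Hello this is my [website](https://www.google.com)"

# \1 puts back the text captured inside the square brackets
print(pattern.sub(r'\1', comment))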

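The line:

pattern = re.compile(r'\[(.*?)\]\(.*?\)')

creates a regex pattern that matches anything enclosed in square brackets, immediately followed by anything enclosed in parentheses. The brackets and parentheses are escaped with backslashes because they are special characters in a regex; the unescaped (...) around the first '.*?' is a capture group. The '?' after '.*' makes each match non-greedy, so it consumes as little text as possible.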
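To make the pieces easier to see, here is the same pattern written with re.VERBOSE, which lets each part carry its own comment. This is only a sketch for illustration; the name LINK_PATTERN is mine, not something from the question.

```python
import re

# Same pattern as above, spread out with re.VERBOSE so each piece can be commented.
# LINK_PATTERN is a hypothetical name chosen for this example.
LINK_PATTERN = re.compile(r"""
    \[        # a literal opening square bracket
    (.*?)     # capture group 1: the link text, as few characters as possible
    \]        # a literal closing square bracket
    \(        # a literal opening parenthesis
    .*?       # the URL, as few characters as possible (not captured)
    \)        # a literal closing parenthesis
""", re.VERBOSE)

match = LINK_PATTERN.search("Hello this is my [website](https://www.google.com)")
print(match.group(0))  # the whole match: [website](https://www.google.com)
print(match.group(1))  # capture group 1: website
```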

The call sub(r'\1', comment) replaces each match with the first capture group, which here is the text inside the square brackets.
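If you want something closer to the RemoveLinks call in the question, the substitution can be wrapped in a small function. This is a minimal sketch; remove_links and LINK_PATTERN are names I picked for the example, and the example.com URLs are placeholders.

```python
import re

# Hypothetical names for illustration; the pattern is the one from this answer.
LINK_PATTERN = re.compile(r'\[(.*?)\]\(.*?\)')

def remove_links(comment):
    """Replace each [text](url) in the comment with just text."""
    # \1 refers to capture group 1, the text between the square brackets
    return LINK_PATTERN.sub(r'\1', comment)

print(remove_links("Hello this is my [website](https://www.google.com)"))
# Hello this is my website

# Because the quantifiers are non-greedy, each link is matched separately
print(remove_links("See [one](https://example.com/a) and [two](https://example.com/b)"))
# See one and two
```

Note that the pattern stops at the first closing parenthesis, so a URL that itself contains ')' would leave some trailing text behind.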

For more information on regular expressions, I suggest you read this.
