Remove adjacent duplicate words in a string with Python?

Question

How can I remove adjacent duplicate words in a string? For example, 'Hey there There' -> 'Hey there'

Answer 1

Using re.sub with a backreference, we can try:

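import re

inp = 'Hey there There'
# \1 in the pattern refers back to the word captured by (\w+);
# \1 in the replacement keeps only the first of the two words.
output = re.sub(r'(\w+) \1', r'\1', inp, flags=re.IGNORECASE)
print(output)  # Hey there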

The regex pattern used here says:

(\w+)  match and capture a word
[ ]    followed by a space
\1     then followed by the same word (ignoring case)

Then we replace the match with just the first of the two adjacent words.
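
To see what the backreference actually captures, the match object can be inspected directly. This is only an illustrative sketch; the variable name m is not part of the original answer.

```python
import re

inp = 'Hey there There'

# Find the first place where a word is followed by a space and the same word.
m = re.search(r'(\w+) \1', inp, flags=re.IGNORECASE)

print(m.group(0))  # there There  (the whole matched span)
print(m.group(1))  # there        (the captured word that \1 refers to)
```

re.sub replaces the whole span (group 0) with group 1, which is why only the first copy of the word survives.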

Answer 2

import re

inp = 'Hey there There'
output = re.sub(r'\b(\w+) \1\b', r'\1', inp, flags=re.IGNORECASE)
print(output)  # Hey there

inp = 'Hey there eating?'
output = re.sub(r'\b(\w+) \1\b', r'\1', inp, flags=re.IGNORECASE)
print(output)  # Hey there eating?

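\b anchors the match at word boundaries, so whole words are compared rather than fragments of words. The second test case ('Hey there eating?') does not work with the answer given by Tim Biegeleisen.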
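
The difference between the two patterns can be checked side by side. The sketch below also wraps the boundary-aware pattern in a helper that collapses runs of three or more repeats; the name dedupe_adjacent and the (?: \1\b)+ extension are assumptions added for illustration and do not come from either answer.

```python
import re

def dedupe_adjacent(text):
    """Collapse runs of the same word (case-insensitive) into the first occurrence."""
    # \b(\w+)      a whole word, captured as group 1
    # (?: \1\b)+   one or more repeats of " <same word>", each ending at a word boundary
    return re.sub(r'\b(\w+)(?: \1\b)+', r'\1', text, flags=re.IGNORECASE)

# Without \b, the final "e" of "there" pairs with the leading "e" of "eating".
no_boundary = re.sub(r'(\w+) \1', r'\1', 'Hey there eating?', flags=re.IGNORECASE)
print(no_boundary)                               # Hey thereating?

print(dedupe_adjacent('Hey there eating?'))      # Hey there eating?
print(dedupe_adjacent('Hey there There THERE'))  # Hey there
```

Both versions assume words are separated by exactly one space; other whitespace or punctuation between the duplicates would need a different separator in the pattern.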


python

string

duplicates