How to remove text between two specific words in Python

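I have parsed a URL with the Beautiful Soup package to get its text. I want to remove all the text in the terms and conditions section, i.e. all the words in the paragraph "Key terms: ......... T&Cs apply."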

Here is what I tried:

import re

# "text" is part of the text contained in the URL
text = """Welcome to Company Key.
Key Terms; Single bets only. Any returns from the free bet will be paid
back into your account minus the free bet stake. Free bets can only be
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday
26th February 2019. Bonus T&Cs and General T&Cs apply.
"""
# to remove words between "Key" and "T&Cs"
rex = re.compile(r'Key\ (.*?)T&Cs.')
terms_and_cons=rex.findall(text)
text=re.sub("|".join(terms_and_cons)," ",text)
#I also tried: text=re.sub(terms_and_cons[0]," ",text)
print(text)

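The code above leaves the string 'text' unchanged, even though the list "terms_and_cons" is non-empty. How can I successfully remove the words between "Key" and "T&Cs"? Please help. I have been stuck on this piece of code, which should be simple, for a long time, and it is getting very frustrating. Thanks.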

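Your regular expression is missing the re.DOTALL flag. Without it, the dot does not match newline characters, so the pattern cannot span several lines.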
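To see the effect of the flag in isolation, here is a minimal sketch. The string `sample` is made up for illustration and is not the text from the question.

```python
import re

# Hypothetical two-line sample, only for illustration
sample = "Key Terms; line one\nline two. T&Cs apply."

# Without re.DOTALL, "." stops at the newline, so nothing can match across lines
no_flag = re.search(r"Key\s(.*?)T&Cs", sample)
print(no_flag)  # None

# With re.DOTALL, "." also matches "\n", so the match can cross the line break
with_flag = re.search(r"Key\s(.*?)T&Cs", sample, re.DOTALL)
print(repr(with_flag.group(1)))  # 'Terms; line one\nline two. '
```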

Method 1: using re.sub

import re

text = """Welcome to Company Key.
Key Terms; Single bets only. Any returns from the free bet will be paid
back into your account minus the free bet stake. Free bets can only be
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday
26th February 2019. Bonus T&Cs and General T&Cs apply.
"""

# re.DOTALL lets "." match newlines, so the pattern can span several lines
rex = re.compile(r"Key\s(.*)T&Cs", re.DOTALL)
# Replace the whole match, keeping only the two marker words
text = rex.sub("Key T&Cs", text)
print(text)
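Note that `(.*)` is greedy, so the match runs up to the last "T&Cs" in the text, while a lazy `(.*?)` would stop at the first one. Which one is wanted depends on the page. A small sketch of the difference, using a shortened, made-up string `sample`:

```python
import re

# Shortened, hypothetical sample with two occurrences of "T&Cs"
sample = "Key Terms; a. Bonus T&Cs and General T&Cs apply."

# Greedy: consumes everything up to the LAST "T&Cs"
greedy = re.sub(r"Key\s.*T&Cs", "Key T&Cs", sample, flags=re.DOTALL)
# Lazy: stops at the FIRST "T&Cs"
lazy = re.sub(r"Key\s.*?T&Cs", "Key T&Cs", sample, flags=re.DOTALL)

print(greedy)  # Key T&Cs apply.
print(lazy)    # Key T&Cs and General T&Cs apply.
```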

Method 2: using a group

Match the text with a capture group, then remove that group's text from the original string.

import re

text = """Welcome to Company Key.
Key Terms; Single bets only. Any returns from the free bet will be paid
back into your account minus the free bet stake. Free bets can only be
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday
26th February 2019. Bonus T&Cs and General T&Cs apply.
"""

rex = re.compile(r"Key\s(.*)T&Cs", re.DOTALL)
match = rex.search(text)
if match:
    # group(1) is the text between "Key" and the last "T&Cs";
    # str.replace removes it as a literal string, not as a pattern
    text = text.replace(match.group(1), "")
print(text)
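One likely reason the attempt in the question left the text unchanged is that it passed the captured text back to re.sub as a pattern. Characters such as "(", ")" and "." are regex metacharacters, so the captured text no longer matches itself literally. str.replace avoids this because it works on literal strings. If re.sub is preferred, re.escape should do the same job. A rough sketch on a shortened, made-up version of the text:

```python
import re

# Shortened, hypothetical version of the text from the question
text = "Key Terms; maximum odds of 5.00 (4/1). Bonus T&Cs apply."

captured = re.search(r"Key\s(.*)T&Cs", text, re.DOTALL).group(1)

# As a raw pattern, "(4/1)" is a capture group that matches "4/1" without
# the parentheses, so the pattern fails to match the literal text
unescaped = re.sub(captured, "", text)

# re.escape turns every metacharacter into a literal character
escaped = re.sub(re.escape(captured), "", text)

print(unescaped == text)  # True: nothing was removed
print(escaped)            # Key T&Cs apply.
```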