How to remove the substring "- " from a string in Python while keeping the " - " substring?
Example:
string = " a lot of text ... protective equip- ment ... a lot of text - with similar broken words like simple appli- cations ..."
I need to get the same text, but with "equip- ment" turned into "equipment" and "appli- cations" turned into "applications".
Thanks.
If you want to remove the '- ' between two words, you can use the following regular expression:
>>> import re
>>> string = " a lot of text ... protective equip- ment ... a lot of text - with similar broken words like simple appli- cations ..."
>>> re.sub(r"(\w+)- (\w+)", r"\1\2", string)
' a lot of text ... protective equipment ... a lot of text - with similar broken words like simple applications ...'
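For the word fragments to survive the substitution, the replacement has to put both captured groups back. A minimal sketch of that idea, using a replacement function instead of a template string (the function name `rejoin` and the shortened sample text are assumptions made for illustration):

```python
import re

text = "protective equip- ment - with simple appli- cations"

def rejoin(match):
    # group(1) is the word fragment before "- ", group(2) the fragment after it
    return match.group(1) + match.group(2)

result = re.sub(r"(\w+)- (\w+)", rejoin, text)
print(result)
```

The standalone " - " is left alone because `(\w+)` needs a word character directly before the hyphen.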
A regular expression that requires a hyphen followed by a space, but rejects the match if the hyphen is preceded by a space, will do the job:
import re
string = "a lot of text ... protective equip- ment ... a lot of text - with similar broken words like simple appli- cations ..."
print(re.sub(r"(?<! )- ", "", string))
Output:
a lot of text ... protective equipment ... a lot of text - with similar broken words like simple applications ...
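One edge case may be worth checking: the negative lookbehind `(?<! )` also matches at the very start of the string, so a leading "- " would be removed too. A possible stricter variant, sketched here under the assumption that only hyphens sitting directly between two word characters should go (the helper name `join_broken_words` is hypothetical):

```python
import re

# Hyphen + space, only when a word character precedes and follows it
BROKEN_WORD = re.compile(r"(?<=\w)- (?=\w)")

def join_broken_words(text):
    """Remove '- ' only where it splits a word into two fragments."""
    return BROKEN_WORD.sub("", text)

sample = "protective equip- ment - with simple appli- cations"
print(join_broken_words(sample))

# A leading "- " is kept by the stricter pattern ...
print(join_broken_words("- leading dash"))
# ... but removed by the negative-lookbehind pattern
print(re.sub(r"(?<! )- ", "", "- leading dash"))
```

Note that neither pattern can tell a broken word from a suspended hyphen such as "pre- and post-war", so text containing those would need extra handling.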