How to catch words that end with a dash and the word that follows? Regex

Question

I have text like this:

IN THE 18th century, suicide was regard- ed, particularly by the French, as an English disease. 'The English destroy themselves most unaccountably,' wrote Montesquieu, and Voltaire was told that during an East wind the English hanged themselves by the dozen. True or not, the chaussure is now on the other foot. The suicide rate for men in England and Wales is about 10 per 100,000 inhabitants, com- pared with 30 in France

I want to capture these instances:

regard- ed
com- pared

I tried r'\s[a-z].*-[a-z].*\s' to catch a word, then a dash, then another word, but it does not match correctly.

I tried r'\s[a-z].*-' and it matched:

 by the French, as an English disease. 'The English destroy themselves most unaccountably,' wrote Montesquieu, and Voltaire was told that during an East wind the English hanged themselves by the dozen. True or not, the chaussure is now on the other foot. The suicide rate for men in England and Wales is about 10 per 100,000 inhabitants, com-

Then I tried r'-\s[a-z].*\s' and it captured:

- pared with 30 in France.

I could have tried:

left = re.search(r'\s[a-z].*-', text).group().rpartition(' ')[2]
right = re.search(r'-\s[a-z].*\s', text).group().partition(' ')[0]
left[:-1] + right[2:]

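But I am sure there is a way to do this with a single regex, without all the partition mess. So how do I capture the desired instances using a single regex? (Assume that a dash at the end of a word always indicates a desired instance, but a dash padded with spaces on both sides, such as com - pared, is not wanted.)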

Answer 1

My first attempt at a correct RE pattern, matching what I actually see in the text:

r'\w+- \w+'

seems to work fine. re.findall with this pattern returns

['regard- ed', 'com- pared']
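A minimal, self-contained sketch of the above, assuming Python 3 and a shortened copy of the sample text. The variable names `text`, `pattern`, `matches`, and `joined` are mine, not from the question, and the trailing "com - pared" is added only to check that a space-padded dash is skipped. The last step shows one possible way to glue the halves back together with capture groups:

```python
import re

# Shortened sample; "com - pared" is added to show that padded dashes are skipped.
text = ("suicide was regard- ed, particularly by the French. "
        "The rate is about 10 per 100,000 inhabitants, com- pared with 30 "
        "in France, not com - pared.")

# \w+ matches the first half, which must be followed directly by "- "
# and then by the second half of the word.
pattern = r'\w+- \w+'
matches = re.findall(pattern, text)
print(matches)

# Capture both halves and rejoin them, dropping the dash and the space.
joined = re.sub(r'(\w+)- (\w+)', r'\1\2', text)
print(joined)
```

Because the pattern requires a word character immediately before the dash, "com - pared" is left untouched in both steps.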

Answer 2

Try storing your text in a string like

text = 'Suicide was regard- ed'

and then

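def finddashes(text):
    # Return True if some dash directly follows a non-space character
    # and is directly followed by a space, e.g. "regard- ed".
    for x in range(1, len(text) - 1):
        if text[x] == '-' and text[x + 1] == ' ' and text[x - 1] != ' ':
            return True
    return False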
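If the goal is to collect the split words rather than only detect them, the same character scan can be extended. This is just a sketch under my own assumptions: the helper name `find_dash_pairs` is hypothetical, and a "word" here is a run of characters for which `str.isalnum()` is true, which is close to but not identical to `\w+` (it excludes the underscore).

```python
def find_dash_pairs(text):
    """Return 'left- right' pairs where a dash follows a word and precedes a space."""
    pairs = []
    for i, ch in enumerate(text):
        # Only consider dashes that have a character on both sides.
        if ch != '-' or i == 0 or i + 1 >= len(text):
            continue
        # The dash must touch a word on the left and a space on the right.
        if not text[i - 1].isalnum() or text[i + 1] != ' ':
            continue
        # Walk left to the start of the first half.
        start = i
        while start > 0 and text[start - 1].isalnum():
            start -= 1
        # Walk right over the second half, which begins after the space.
        end = i + 2
        while end < len(text) and text[end].isalnum():
            end += 1
        if end > i + 2:  # require at least one character in the second half
            pairs.append(text[start:end])
    return pairs


result = find_dash_pairs('Suicide was regard- ed, com- pared, not com - pared')
print(result)
```

For most texts the single regex from Answer 1 is shorter and easier to maintain; the loop is mainly useful if you need custom rules about what counts as a word character.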

python

regex

string

hyphen