Regular expressions in Python: replacing couple names

Question

I want to find expressions such as "John and Jane Doe" and replace them with "John Doe and Jane Doe".

Sample input:

regextest = 'Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown'

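I can find the expression and replace it with a fixed string, but I can't work out how to replace it with a modified version of the matched text.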

re.sub(r'[a-zA-Z]+\s*and\s*[a-zA-Z]+.[^,]*',"kittens" ,regextest)

Output: 'Heather Robinson, kittens, kittens, Jimmy Nichols, Melanie Carbone, and Nancy Brown'

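I think that instead of a string ("kittens") we can pass a function that makes the change, but I have not managed to write that function. The attempt below raises an error.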

def re_couple_name_and(m):
    return f'*{m.group(0).split()[0]+m.group(0).split()[-1:]+ m.group(0).split()[1:]}'
    
re.sub(r'[a-zA-Z]+\s*and\s*[a-zA-Z]+.[^,]*',re_couple_name_and ,regextest)
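For context, the callback above appears to fail because `m.group(0).split()[0]` is a `str` while the two slices are lists, and `+` cannot join the two types. A minimal sketch that reproduces this on a shortened sample (the `sample` string and the `error_name` variable are introduced here for illustration only):

```python
import re

sample = 'Jane and John Smith'  # shortened sample, for illustration only

def re_couple_name_and(m):
    parts = m.group(0).split()  # ['Jane', 'and', 'John', 'Smith']
    # parts[0] is a str; parts[-1:] and parts[1:] are lists,
    # so the "+" below raises TypeError before the f-string is built.
    return f'*{parts[0] + parts[-1:] + parts[1:]}'

try:
    re.sub(r'[a-zA-Z]+\s*and\s*[a-zA-Z]+.[^,]*', re_couple_name_and, sample)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)
```

Whatever the callback does internally, it has to return a single string, because `re.sub` splices the return value into the result.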

Answer 1

IIUC, here is one approach using capture groups:

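import re

def re_couple_name_and(m):
    # group(3) is "<second first name> <family name>"; keep everything after the first space
    family_name = m.group(3).split(" ", 1)[1]
    return "%s %s" % (m.group(1), family_name) + m.group(2) + m.group(3)

# regextest is the sample string from the question
re.sub(r'([a-zA-Z]+)(\s*and\s*)([a-zA-Z]+.[^,]*)', re_couple_name_and, regextest)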

Output:

'Heather Robinson, Jane Smith and John Smith, Kiwan Brady John and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown'
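A variant of this callback, offered as a sketch rather than as the answerer's code: it uses named groups (`first`, `second` and `family` are names chosen here for illustration) and requires whitespace before the family name. Under that assumption, a pair with no trailing surname such as "Tom and Jerry" is left unchanged, whereas splitting on the first space can raise `IndexError` when there is no space to split on.

```python
import re

regextest = ('Heather Robinson, Jane and John Smith, '
             'Kiwan and Nichols Brady John, Jimmy Nichols, '
             'Melanie Carbone, and Nancy Brown')

# Hypothetical pattern: two first names joined by "and", followed by
# whitespace and a family name that runs up to the next comma.
COUPLE = re.compile(
    r'(?P<first>[a-zA-Z]+)\s+and\s+(?P<second>[a-zA-Z]+)\s+(?P<family>[^,]+)'
)

def expand_couple(m):
    # m is the re.Match object that re.sub passes to the callback.
    family = m.group('family').strip()
    return f"{m.group('first')} {family} and {m.group('second')} {family}"

result = COUPLE.sub(expand_couple, regextest)
print(result)
```

The standalone "and Nancy Brown" is not touched, because the pattern needs a name directly before "and" and here a comma precedes it.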

Answer 2

You can use the regex below to capture the parts that need to be rearranged, then build the new string with re.sub().

(\w+)( +and +)(\w+)( +[^,]*)

Example

import re
text="Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown"
# \1 = first name, \2 = " and ", \3 = second first name, \4 = " <family name>"
print(re.sub(r"(\w+)( +and +)(\w+)( +[^,]*)", r"\1\4\2\3\4", text))

Output

Heather Robinson, Jane Smith and John Smith, Kiwan Brady John and Nichols Brady John, Jimmy Nichols, Melanie Carbone, and Nancy Brown
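The same rearrangement can be sketched with named groups, which may be easier to read than numbered backreferences. The group names (`first`, `sep`, `second`, `family`) are assumptions made for this illustration; `re.subn` is used so the number of replacements can be checked as well.

```python
import re

text = ("Heather Robinson, Jane and John Smith, Kiwan and Nichols Brady John, "
        "Jimmy Nichols, Melanie Carbone, and Nancy Brown")

# Same structure as the pattern above, with names instead of numbers.
pattern = r"(?P<first>\w+)(?P<sep> +and +)(?P<second>\w+)(?P<family> +[^,]*)"

# \g<name> inserts the text captured by the named group.
replacement = r"\g<first>\g<family>\g<sep>\g<second>\g<family>"

# re.subn returns the new string together with the number of substitutions.
result, count = re.subn(pattern, replacement, text)
print(result)
print(count)
```

A replacement template like this is enough when the captured pieces only need to be reordered or repeated; a callback function is needed only when the pieces must be transformed first.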

python-3.x

python-re