Python pattern to replace words between single or double quotes

Question

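I'm new to Python and not at all familiar with regular expressions. I need to modify a pattern in some existing code.

I have extracted the code that I am trying to fix.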

import re

def replacer_factory(spelling_dict):
    # Build a callback for re.sub that looks the matched text up in spelling_dict
    def replacer(match):
        word = match.group()
        return spelling_dict.get(word, word)
    return replacer

def main():
    repkeys = {'modify': 'modifyNew', 'extract': 'extractNew'}
    with open('test.xml', 'r') as file:
        filedata = file.read()
    pattern = r'\b\w+\b' # this pattern matches whole words only
    #pattern = r'[\'"]\w+[\'"]'
    #pattern = r'["]\w+["]'
    #pattern = '\b[\'"]\w+[\'"]\b'
    #pattern = '(["\'])(?:(?=(\?)).)*?'

    replacer = replacer_factory(repkeys)
    filedata = re.sub(pattern, replacer, filedata)

if __name__ == '__main__':
    main()

Input

<fn:modify ele="modify">
<fn:extract name='extract' value="Title"/>
</fn:modify>

Expected output. Note that the word to be replaced can be enclosed in either single or double quotes.

<fn:modify ele="modifyNew">
<fn:extract name='extractNew' value="Title"/>
</fn:modify>

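The existing pattern r'\b\w+\b' produces, for example, <fn:modifyNew ele="modifyNew">, but what I'm looking for is <fn:modify ele="modifyNew">

The patterns I have tried so far are given as comments. I realised later that some of them are wrong, because without the r prefix a string literal treats backslash sequences specially (for example, '\b' becomes a backspace character rather than a word boundary). I have left them in anyway to show what I have tried so far.

It would be great if I could get a pattern that solves this, rather than changing the logic. If this cannot be done with the existing code, please point that out as well. The environment I work in has Python 2.6.

Any help is appreciated.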

Answer 1

You need to use the r'''(['"])(\w+)\1''' regex, and then adapt the replacer method:

def replacer_factory(spelling_dict):
    def replacer(match):
        return '{0}{1}{0}'.format(match.group(1), spelling_dict.get(match.group(2), match.group(2)))
    return replacer

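The (['"])(\w+)\1 pattern matches a word enclosed in matching double or single quotes: group 1 captures the opening quote, and the \1 backreference requires the same quote to close. The word itself is in group 2, hence spelling_dict.get(match.group(2), match.group(2)). The quotes are consumed by the match, so they must be put back, hence '{0}{1}{0}'.format().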
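To illustrate what a backreference such as \1 does in a pattern like (['"])(\w+)\1, here is a small sketch. The sample strings and the quoted_words helper are made up for illustration and are not part of the question's code.

```python
import re

# Group 1 captures the opening quote; \1 demands the same character to close.
pattern = re.compile(r'''(['"])(\w+)\1''')

# Hypothetical sample strings, for illustration only.
samples = ['ele="modify"', "name='extract'", 'bad="modify\'']

def quoted_words(text):
    """Return the words wrapped in a matching pair of quotes."""
    return [m.group(2) for m in pattern.finditer(text)]

results = [quoted_words(s) for s in samples]
print(results)
```

The third sample opens with a double quote and closes with a single quote, so it yields no match at all.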

See the Python demo:

import re
def replacer_factory(spelling_dict):
    def replacer(match):
        return '{0}{1}{0}'.format(match.group(1), spelling_dict.get(match.group(2), match.group(2)))
    return replacer

repkeys = {'modify': 'modifyNew', 'extract': 'extractNew'}
pattern = r'''(['"])(\w+)\1'''
replacer = replacer_factory(repkeys)
filedata = """<fn:modify ele="modify">
<fn:extract name='extract' value="Title"/>
</fn:modify>"""
print( re.sub(pattern, replacer, filedata) )

Output:

<fn:modify ele="modifyNew">
<fn:extract name='extractNew' value="Title"/>
</fn:modify>
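If changing only the pattern is a hard requirement, one possible alternative is to use lookarounds so that the quotes are asserted but not consumed. This is a sketch rather than a drop-in recommendation: it keeps the question's original replacer, and it assumes that checking for any quote on each side is good enough, since a lookaround version cannot easily require the two quotes to match.

```python
import re

def replacer_factory(spelling_dict):
    # Unchanged replacer from the question: looks up the whole match.
    def replacer(match):
        word = match.group()
        return spelling_dict.get(word, word)
    return replacer

repkeys = {'modify': 'modifyNew', 'extract': 'extractNew'}
# The lookbehind and lookahead assert a quote on each side without
# consuming it, so match.group() is just the word itself.
pattern = r'''(?<=['"])\w+(?=['"])'''
filedata = """<fn:modify ele="modify">
<fn:extract name='extract' value="Title"/>
</fn:modify>"""
result = re.sub(pattern, replacer_factory(repkeys), filedata)
print(result)
```

The trade-off is strictness: this version would also accept a word that opens with one kind of quote and closes with the other, which the backreference version rejects.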
