Replace all slashes with backslashes within a specified tag for a file path

Question

I have XML-like files with tags such as these:

<id>SomeID</id>
<datasource>C:/projects/my_project/my_file.jpg</datasource>
<title>My title can include / and other characters</title>
<abstract></abstract>

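I want to change all slashes to backslashes, but only inside the datasource tag (between its opening and closing tags).

What is the general regex syntax for doing this? Update: I finally arrived at a first working solution in Python: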

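import re

# Greedy patterns: everything up to and including the opening tag,
# and everything from the closing tag to the end of the line.
regex_01 = re.compile(".*<datasource>")
regex_02 = re.compile("</datasource>.*")
file_content = ""
# source_file is assumed to be an already opened text file object.
for line in source_file.readlines():
    if "<datasource>" in line:
        start = regex_01.search(line).group()
        end = regex_02.search(line).group()
        part_to_replace = line.replace(start, "").replace(end, "")
        # "\\" is a single literal backslash; a bare "\" is a syntax error.
        replaced = part_to_replace.replace("/", "\\")
        file_content = file_content + start + replaced.strip() + end + "\n"
    else:
        file_content = file_content + line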

Can you recommend something more elegant?

Answer 1

Try this:

(?=ce>)[\s\S]*?(?<=<\/d)

Demo: https://regex101.com/r/VVcUMy/2
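A lookahead can also target the slashes themselves, so that a plain substitution works in Python's standard re module. The following is only a minimal sketch, not the pattern from the demo above: it assumes the datasource content never contains a "<" character, and the names SAMPLE, SLASH_IN_DATASOURCE and fix_slashes are made up for illustration.

```python
import re

# Hypothetical sample input mirroring the snippet from the question.
SAMPLE = (
    "<id>SomeID</id>\n"
    "<datasource>C:/projects/my_project/my_file.jpg</datasource>\n"
    "<title>My title can include / and other characters</title>\n"
    "<abstract></abstract>\n"
)

# A slash qualifies only if the next "<" after it starts "</datasource>".
# Assumption: the tag content itself never contains "<".
SLASH_IN_DATASOURCE = re.compile(r"/(?=[^<]*</datasource>)")


def fix_slashes(text):
    # A callable replacement is inserted literally, which avoids
    # backslash-escaping rules in replacement strings.
    return SLASH_IN_DATASOURCE.sub(lambda m: "\\", text)


if __name__ == "__main__":
    print(fix_slashes(SAMPLE))
```

The slash in the title line is left alone because the next "<" after it opens "</title>", not "</datasource>".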

Answer 2

You could try the skip/fail syntax:

(?:<datasource>[^/]*?|.*(?=<datasource>)|(?=</datasource>).*)(*SKIP)(*FAIL)|/

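See it working here: https://regex101.com/r/86gc4d/1.

But this one is for PCRE. In Python, (*FAIL) can also be written as (?!), but I'm not sure about (*SKIP).

If I remember correctly, it should have been added to the newer regex engine for Python: https://pypi.python.org/pypi/regex.

You can find documentation for the (*SKIP)(*FAIL) syntax here: http://www.rexegg.com/backtracking-control-verbs.html#skipfail, which also says it works in Python, as in the example from that section: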

# Python
# if you don't have the regex package, pip install regex

import regex as mrab

# print(regex.__version__) should output 2.4.76 or higher
print(mrab.findall(r'{[^}]*}(*SKIP)(*FAIL)|\b\w+\b',
                   'good words {and bad} {ones}'))
# ['good', 'words']
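
If installing the third-party regex package is not an option, the standard re module (which does not support these backtracking verbs) can get the same result with a replacement callback. This is a minimal sketch under the assumption that datasource tags are not nested; the names DATASOURCE, _swap and backslash_datasource are hypothetical.

```python
import re

# Capture the opening tag, the content, and the closing tag separately.
# DOTALL lets the content span several lines; the lazy .*? stops at the
# first closing tag.
DATASOURCE = re.compile(r"(<datasource>)(.*?)(</datasource>)", re.DOTALL)


def _swap(match):
    opening, body, closing = match.groups()
    # Only the content between the tags is touched.
    return opening + body.replace("/", "\\") + closing


def backslash_datasource(text):
    # The string returned by the callback is inserted as-is.
    return DATASOURCE.sub(_swap, text)


if __name__ == "__main__":
    sample = (
        "<datasource>C:/projects/my_project/my_file.jpg</datasource>\n"
        "<title>My title can include / and other characters</title>\n"
    )
    print(backslash_datasource(sample))
```

Because the whole file content can be passed in one call, this also replaces the line-by-line loop from the question.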

Hope this helps!
