How can I invert the output of wdiff?

Given two files, old.txt and new.txt, generated for example on the command line with

echo "This is not what I want." > old.txt
echo "This is what I want!" > new.txt

I can run wdiff to generate a word-level diff file:

wdiff old.txt new.txt > diff.txt

Reading it with cat diff.txt gives me:

This is [-not-] what I [-want.-] {+want!+}

Starting from diff.txt alone, how can I parse it to recover the original contents of old.txt and new.txt?

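(This should always be possible in principle, because wdiff seems to preserve all the textual information of both the "old" and the "edited" text files; see this gist for another example.)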

One option would be to build a simple parser (e.g. in Python) using regular expressions:

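import re

# Raw strings keep '\1' a backreference (a plain '\1' is the byte 0x01);
# re.DOTALL lets a marked span run across line breaks.

def get_edited(diff):
  # Drop the deleted words, then unwrap the inserted ones.
  diff = re.sub(r'\[-(.*?)-\]', '', diff, flags=re.DOTALL)
  edited = re.sub(r'\{\+(.*?)\+\}', r'\1', diff, flags=re.DOTALL)
  return edited

def get_original(diff):
  # Unwrap the deleted words, then drop the inserted ones.
  diff = re.sub(r'\[-(.*?)-\]', r'\1', diff, flags=re.DOTALL)
  original = re.sub(r'\{\+(.*?)\+\}', '', diff, flags=re.DOTALL)
  return original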

But it would be nice if there were a built-in way to do this. Any suggestions?

Regular expressions seem to be the right choice.
For an example on GitHub, see https://github.com/snukky/wikiedits/blob/master/bin/wdiff_to_parallel.py
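
As a rough sketch of the regex approach (not taken from the linked script), both texts can be rebuilt in a single pass over the diff. The names split_wdiff and normalize_spaces are made up for this illustration, and the sketch assumes wdiff's default [-...-] and {+...+} markers. Dropping a marked word leaves the spaces on both sides of it behind, so the second helper collapses runs of whitespace; that step is lossy if the original spacing mattered.

```python
import re

# One pattern for both marker kinds (wdiff's default delimiters are assumed).
# re.DOTALL lets a marked span run across line breaks.
MARKER = re.compile(r"\[-(?P<deleted>.*?)-\]|\{\+(?P<inserted>.*?)\+\}", re.DOTALL)

def split_wdiff(diff):
    """Return (original, edited) rebuilt from wdiff output (hypothetical helper)."""
    original, edited = [], []
    pos = 0
    for m in MARKER.finditer(diff):
        # Text between markers is common to both versions.
        common = diff[pos:m.start()]
        original.append(common)
        edited.append(common)
        if m.group("deleted") is not None:
            original.append(m.group("deleted"))  # only in the old text
        else:
            edited.append(m.group("inserted"))   # only in the new text
        pos = m.end()
    # Whatever follows the last marker is common as well.
    tail = diff[pos:]
    original.append(tail)
    edited.append(tail)
    return "".join(original), "".join(edited)

def normalize_spaces(text):
    """Collapse whitespace runs within each line (lossy; hypothetical helper)."""
    return "\n".join(" ".join(line.split()) for line in text.split("\n"))

if __name__ == "__main__":
    diff = "This is [-not-] what I [-want.-] {+want!+}\n"
    original, edited = split_wdiff(diff)
    print(normalize_spaces(original), end="")
    print(normalize_spaces(edited), end="")
```

Note that this only works as long as the text itself never contains the marker sequences; a literal "[-" or "{+" in either file would be misread as markup.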