How can I format the output of Python's difflib.HtmlDiff to make it readable?

I'm trying to use the difflib library in Python 2 to output the differences between two text files, using HtmlDiff to generate an HTML file.

V1 = 'This has four words'
V2 = 'This has more than four words'

res = difflib.HtmlDiff().make_table(V1, V2)

text_file = open(OUTPUT, "w")
text_file.write(res)
text_file.close()

But in the browser, the output HTML shows every character being compared separately, which makes it completely unreadable.

What should I change to make the comparison more human-friendly (for example, a complete sentence on each side)?

If the input is given as "lines" (a list of strings), the output is also formatted line by line, but the differences are not shown:

V1 = ['This has four words']
V2 = ['This has more than four words']

res = difflib.HtmlDiff().make_table(V1, V2)

text_file = open(OUTPUT, "w")
text_file.write(res)
text_file.close()

Resulting HTML (as viewed in the browser):

To get the markup, you can use difflib.SequenceMatcher, as in the function defined in this answer, which gives this code:

import difflib

def show_diff(seqm):
    # function from 
    """Unify operations between two compared strings
seqm is a difflib.SequenceMatcher instance whose a & b are strings"""
    output= []
    for opcode, a0, a1, b0, b1 in seqm.get_opcodes():
        if opcode == 'equal':
            output.append(seqm.a[a0:a1])
        elif opcode == 'insert':
            output.append("<ins>" + seqm.b[b0:b1] + "</ins>")
        elif opcode == 'delete':
            output.append("<del>" + seqm.a[a0:a1] + "</del>")
        elif opcode == 'replace':
            raise NotImplementedError( "what to do with 'replace' opcode?" )
        else:
            raise RuntimeError(f"unexpected opcode {opcode}")
    return ''.join(output)


V1 = 'This has four words but fewer than eleven'
V2 = 'This has more than four words'


sm = difflib.SequenceMatcher(None, V1, V2)

html = "<html><body>"+show_diff(sm)+"</body></html>"

# Use a context manager so the file is closed (and flushed) reliably.
with open("output.html", "wt") as f:
    f.write(html)

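This produces an HTML page in which inserted text is wrapped in <ins> tags and deleted text in <del> tags.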
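If a character-level diff is still too noisy, one possible variation is to compare words instead of characters. The sketch below is only an illustration under a few assumptions: the helper name show_word_diff is hypothetical, a 'replace' opcode is rendered as a deletion followed by an insertion (the function above leaves that case unimplemented), and the text is passed through html.escape so that stray < or & characters do not break the page.

```python
import difflib
import html


def show_word_diff(a_text, b_text):
    """Return HTML marking word-level insertions and deletions.

    Hypothetical helper: it splits both texts on whitespace and lets
    difflib.SequenceMatcher compare the resulting lists of words.
    """
    a_words = a_text.split()
    b_words = b_text.split()
    seqm = difflib.SequenceMatcher(None, a_words, b_words)
    output = []
    for opcode, a0, a1, b0, b1 in seqm.get_opcodes():
        # Escape the text so it is safe to embed in an HTML page.
        old = html.escape(" ".join(a_words[a0:a1]))
        new = html.escape(" ".join(b_words[b0:b1]))
        if opcode == "equal":
            output.append(old)
        elif opcode == "insert":
            output.append("<ins>" + new + "</ins>")
        elif opcode == "delete":
            output.append("<del>" + old + "</del>")
        elif opcode == "replace":
            # Assumption: show a replacement as a deletion plus an insertion.
            output.append("<del>" + old + "</del><ins>" + new + "</ins>")
    return " ".join(output)


V1 = 'This has four words but fewer than eleven'
V2 = 'This has more than four words'
print(show_word_diff(V1, V2))
# This has <ins>more than</ins> four words <del>but fewer than eleven</del>
```

Splitting on whitespace discards the original spacing, so this suits single sentences better than preformatted text.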

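The problem is that you don't have the required styles. make_table returns only the table markup, whereas make_file returns a complete HTML page that includes the CSS. Try make_file instead of make_table, and you'll see that the CSS makes the colors show up as you'd expect.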
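As a minimal sketch of that suggestion: the texts are split into lists of lines first (HtmlDiff compares sequences of lines, so a plain string is treated as one "line" per character), and make_file is used so the stylesheet is included. The sample texts, the fromdesc/todesc labels, and the output file name diff.html are assumptions for illustration only.

```python
import difflib

v1 = "This has four words\nA line that stays the same\n"
v2 = "This has more than four words\nA line that stays the same\n"

# Split each text into a list of lines before comparing.
from_lines = v1.splitlines()
to_lines = v2.splitlines()

# make_file returns a full HTML document, including the CSS rules
# (classes such as diff_add, diff_chg and diff_sub) that color the changes.
html_page = difflib.HtmlDiff().make_file(
    from_lines, to_lines, fromdesc="V1", todesc="V2"
)

with open("diff.html", "w", encoding="utf-8") as fh:
    fh.write(html_page)
```

When reading real files, passing the result of readlines() (or read().splitlines()) for each file should work the same way.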