difflib output is very strange, adding extra whitespace on each character

Question

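I'm trying out difflib in Python, but I'm having trouble getting the output to look good. For some strange reason, difflib adds a single space before each character. For example, I have a file (textfile01.txt) that looks like this: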

test text which has no meaning

and textfile02.txt:

test text which has no meaning

but looks nice

Here is a small code sample of how I tried to do the comparison:

import difflib

handle01 = open('textfile01.txt', 'r')
handle02 = open('textfile02.txt', 'r')

d = difflib.ndiff(handle01.read(), handle02.read())
print "".join(list(d))

Then I get this ugly output, which looks... really weird:

t e s t t e x t w h i c h h a s n o m e a n i n g-

- b- u- t- - l- o- o- k- s- - n- i- c- e

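As you can see, the output looks terrible. I have been following basic difflib tutorials I found online, and according to those the output should look completely different. I have no idea what I'm doing wrong. Any ideas?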

Answer 1

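difflib.ndiff compares lists of strings, but you are passing it plain strings, and a string is really a sequence of characters. As a result, the function compares the two strings character by character.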

>>> list(difflib.ndiff("test", "testa"))
['  t', '  e', '  s', '  t', '+ a']

(Literally, you can get from the list ["t", "e", "s", "t"] to the list ["t", "e", "s", "t", "a"] by adding the element ["a"] to it.)
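To make the link to the spaced-out output in the question explicit, here is a minimal sketch in Python 3 syntax (the variable names are only illustrative). Each entry ndiff yields is a two-character prefix followed by one element of the input, and with a plain string that element is a single character:

```python
import difflib

a = "test"
b = "testa"

# A str is iterated character by character, so every "line" that
# ndiff sees here is a single character.
char_diff = list(difflib.ndiff(a, b))
print(char_diff)

# Joining the entries with "" keeps each two-character prefix in place,
# which is where the extra spaces between the letters come from.
joined = "".join(char_diff)
print(repr(joined))
```

The unchanged characters carry a two-space prefix, so the joined text reads as the original word with gaps between its letters.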

You want to change read() to readlines(), so that the two files are compared line by line, which is probably what you expected.

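To get diff-like output on screen, each diff entry also has to end up on its own line. For lines without a trailing newline, as in the example below, change "".join(... to "\n".join(...; lines returned by readlines() keep their trailing newline, so for those "".join(... is already enough.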

>>> list(difflib.ndiff(["test"], ["testa"]))
['- test', '+ testa', '?     +\n']
>>> print "\n".join(_)
- test
+ testa
?     +

(difflib is really nice here: in the ? line it marks the exact position where the character was added.)
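Putting this together for the two sample texts from the question, a rough Python 3 sketch could look like the following. The helper name line_diff is hypothetical, and the file contents are inlined as strings so the snippet runs on its own; splitlines(keepends=True) is used here as a stand-in for what readlines() returns on an open file.

```python
import difflib


def line_diff(old_text, new_text):
    """Return an ndiff-style, line-by-line diff of two strings."""
    # splitlines(keepends=True) mirrors file.readlines(): a list of
    # lines, each one keeping its trailing newline.
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    # The lines already end in "\n", so they are joined with "".
    return "".join(difflib.ndiff(old_lines, new_lines))


# Contents of the two sample files from the question, inlined.
old = "test text which has no meaning\n"
new = "test text which has no meaning\nbut looks nice\n"

result = line_diff(old, new)
print(result, end="")
```

With real files, passing the result of readlines() for each file straight to difflib.ndiff should behave the same way, since those lines also keep their newlines.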

python

difflib