Why do I get a stray element with difflib.ndiff?

Question

Minimal working example:

In [3]: a = ('r1', 'r2', 'r11', 'r6', 'r1', 'r2', 'r7', 'r8')

In [4]: b = ('r1', 'r2', 'r1', 'r6', 'r1', 'r2', 'r7', 'r8')

In [5]: list(difflib.ndiff(a, b))
Out[5]: 
['  r1',
 '  r2',
 '- r11',
 '?   -\n',
 '+ r1',
 '  r6',
 '  r1',
 '  r2',
 '  r7',
 '  r8']

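Can someone explain why the fourth element in the output list ('?   -\n') appears and why it contains a newline character? What can I do to keep ndiff from producing that element and get only the rest of the list?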

Answer 1

From the documentation:

Lines beginning with ‘?’ attempt to guide the eye to intraline differences, and were not present in either input sequence. These lines can be confusing if the sequences contain tab characters.

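It's basically a diff meant for human readers. It seems to be related to how similar r1 and r11 are; changing that element in a to r7 got rid of the ? line. I'm not sure what counts as "similar", though.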
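A rough way to see what "similar" could mean here: as far as I can tell, CPython's Differ only produces a ? guide line for a deleted/inserted pair when their SequenceMatcher ratio clears an internal cutoff (0.75 in the versions I have looked at; that number is an implementation detail, not documented behavior). The sketch below only computes the ratios; the helper name similarity is my own.

```python
from difflib import SequenceMatcher

def similarity(x, y):
    # ratio() is 2 * (number of matched characters) / (total characters in both strings)
    return SequenceMatcher(None, x, y).ratio()

close = similarity("r11", "r1")  # 'r1' matches: 2 * 2 / 5
far = similarity("r7", "r1")     # only 'r' matches: 2 * 1 / 4

print(close, far)
```

Under that assumption, r11 vs r1 scores high enough to get a ? line, while r7 vs r1 does not, which would explain why swapping in r7 made the ? line disappear.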

Answer 2

Because ndiff expects the lines you pass in to end with newlines, like this:

a = ('r1\n', 'r2\n', 'r11\n', 'r6\n', 'r1\n', 'r2\n', 'r7\n', 'r8\n')
b = ('r1\n', 'r2\n', 'r1\n', 'r6\n', 'r1\n', 'r2\n', 'r7\n', 'r8\n')

In the documentation for difflib.Differ.compare, which is what .ndiff() calls under the hood, we see this (emphasis mine):

compare(a, b)
Compare two sequences of lines, and generate the delta (a sequence of lines).

Each sequence must contain individual single-line strings ending with newlines. Such sequences can be obtained from the readlines() method of file-like objects. The delta generated also consists of newline-terminated strings, ready to be printed as-is via the writelines() method of a file-like object.

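The output you're getting makes sense: lines starting with ? are there to highlight what changed. In this case, it draws a - under the second 1 in r11 to show you that it was deleted. difflib expects you to use the output like this: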

print(''.join(difflib.ndiff(a, b)))

So it needs to end any line it adds itself with a newline.

You can add the newlines to your original values with a list comprehension:

a = [line + "\n" for line in a]
b = [line + "\n" for line in b]
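If you only want the rest of the list without the guide line, one option (my own suggestion, not something the difflib docs prescribe) is to drop entries whose two-character prefix is '? '. Every ndiff line starts with a two-character code, so checking the prefix does not touch the actual content. A minimal sketch using the values from the question:

```python
import difflib

a = ('r1', 'r2', 'r11', 'r6', 'r1', 'r2', 'r7', 'r8')
b = ('r1', 'r2', 'r1', 'r6', 'r1', 'r2', 'r7', 'r8')

# Terminate every input line with a newline, as Differ.compare expects.
a_lines = [line + "\n" for line in a]
b_lines = [line + "\n" for line in b]

delta = list(difflib.ndiff(a_lines, b_lines))

# Drop the intraline guide lines, which were never part of either input.
without_guides = [line for line in delta if not line.startswith("? ")]

print("".join(without_guides), end="")
```

The same filter works on the original inputs without trailing newlines, since the ? line is the only entry difflib generates by itself.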

Answer 3

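For anyone trying to figure out exactly what is going on, here is what happens:

So ndiff is trying to give you extra information: you can turn r11 into r1 by deleting its last character, 1. Note: _ refers to the output of the previous command.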
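To make the "delete the last character" reading concrete, here is a small sketch that asks SequenceMatcher for the character-level edit operations between the two strings. This is my own illustration of the idea, not the code the answer originally showed, and the names old and new are arbitrary.

```python
from difflib import SequenceMatcher

old, new = "r11", "r1"
opcodes = SequenceMatcher(None, old, new).get_opcodes()

# Each opcode is (tag, i1, i2, j1, j2): old[i1:i2] maps to new[j1:j2].
for tag, i1, i2, j1, j2 in opcodes:
    print(f"{tag:7} old[{i1}:{i2}]={old[i1:i2]!r} new[{j1}:{j2}]={new[j1:j2]!r}")
```

The 'delete' operation covers index 2 of r11, which is the column where the - sits in the ? line.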

