How can a Python 2 doctest fail and yet have no difference in the values in the failure message?

Question

I'm using Python 2.7.9 on Windows.

I have a UTF-8-encoded Python script file with the following contents:

# coding=utf-8

def test_func():
    u"""
    >>> test_func()
    u'☃'
    """
    return u'☃'

I get a curious failure when I run the doctest:

Failed example:
    test_func()
Expected:
    u'\u2603'
Got:
    u'\u2603'

I see this same failure output whether I launch the doctests through the IDE I usually use (IDEA IntelliJ) or from the command line:

> x:\my_virtualenv\Scripts\python.exe -m doctest -v hello.py

I copied the lines under Expected and Got into WinMerge to rule out some subtle difference in the characters that I couldn't spot; it told me they were identical.

However, if I redo the command-line run but redirect the output to a text file, like so:

> x:\my_virtualenv\Scripts\python.exe -m doctest -v hello.py > out.txt

the test still fails, but the resulting failure output is a bit different:

Failed example:
    test_func()
Expected:
    u'☃'
Got:
    u'\u2603'

If I put the escaped unicode literal in my doctest:

# coding=utf-8

def test_func():
    u"""
    >>> test_func()
    u'\\u2603'
    """
    return u'\u2603'

the test passes. But as far as I know, u'\u2603' and u'☃' should evaluate to the same thing.

Regarding the failing case, I really have two questions:

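Is one of the representations given by doctest (under Expected or Got) incorrect for the value that doctest holds for that case? (i.e. x != eval(repr(x)))
If not, why does the test fail?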

Answer 1

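The doctest module compares the actual output with the expected output as text, and it can use difflib to report the differences between them. difflib works like this: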

>>> import difflib
>>> variation = difflib.unified_diff('x', 'x')
>>> list(variation)
[]
>>> variation = difflib.unified_diff('x', 'y')
>>> list(variation)
['--- \n', '+++ \n', '@@ -1 +1 @@\n', '-x', '+y']

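Behind the scenes, the doctest module formats the actual and expected results several times. Your problem appears to be a misreading caused by string encoding. What is printed to the console has already been formatted (using %s), which removes any visible difference and makes the two values look identical.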
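
To make the mismatch concrete, here is a minimal sketch of the two strings involved. It assumes, beyond what is stated above, that the failure report encodes unicode text to the console encoding with backslashreplace error handling and that the console code page is cp437. The names want, got and as_shown are illustrative only and are not part of doctest.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

# Text of the expected output as stored in the docstring: a real snowman character.
want = "u'\u2603'"
# Text that Python 2's repr() produces for the return value: a literal backslash escape.
got = "u'\\u2603'"

# The two texts differ, so the example is reported as a failure.
print(want == got)          # False
print(len(want), len(got))  # 4 9


def as_shown(text, encoding="cp437"):
    """Hypothetical helper: render text for a console that cannot encode it (assumed cp437)."""
    return text.encode(encoding, "backslashreplace").decode("ascii")


# Once unencodable characters are replaced by escapes, both texts look the same.
print(as_shown(want) == as_shown(got))  # True
```

Under these assumptions the Expected and Got lines are different strings that merely render identically on the console, which would also explain why the redirected output shows the snowman character under Expected.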

Answer 2

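Just as a complement, and also because this possibility was not considered in the discussion at work: I had a loosely similar problem. See: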

[...]
Expected:
    <xarray.DataArray ()>
    array(0.0)
    Coordinates:
        d1   |S3 'nat'
        d2   |S3 'dat'
        d3   |S3 'a'        
Got:
    <xarray.DataArray ()>
    array(0.0)
    Coordinates:
        d1   |S3 'nat'
        d2   |S3 'dat'
        d3   |S3 'a'

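For sure, there is no human-visible difference. Here the d3 line under Expected ends with trailing spaces. In my little case, the solution was to make sure there was no trailing whitespace!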
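
One way to expose this kind of invisible difference is to look at repr() of the two lines, or to let doctest ignore whitespace differences. The sketch below uses shortened copies of the d3 line from the output above; the variable names are illustrative, while doctest.OutputChecker and doctest.NORMALIZE_WHITESPACE come from the standard library.

```python
from __future__ import print_function

import doctest

# Shortened copies of the d3 line: the expected text ends with trailing spaces.
want = "d3   |S3 'a'        \n"
got = "d3   |S3 'a'\n"

# repr() makes the trailing spaces visible.
print(repr(want))
print(repr(got))

checker = doctest.OutputChecker()
# Default comparison: the trailing spaces make the texts unequal.
strict = checker.check_output(want, got, 0)
# With NORMALIZE_WHITESPACE, runs of whitespace are treated as equal.
relaxed = checker.check_output(want, got, doctest.NORMALIZE_WHITESPACE)
print(strict, relaxed)  # False True
```

In a docstring, the same flag can be enabled for a single example with the `# doctest: +NORMALIZE_WHITESPACE` directive, although removing the stray spaces is the cleaner fix.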
