Why does a successful assertEqual not always imply a successful assertItemsEqual?

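The Python 2.7 docs state that assertItemsEqual "is the equivalent of assertEqual(sorted(expected), sorted(actual))". In the example below, every test passes except test4. Why does assertItemsEqual fail in this case?

By the principle of least astonishment, given two iterables, I would expect a successful assertEqual to imply a successful assertItemsEqual.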

import unittest

class foo(object):
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return self.a == other.a

class test(unittest.TestCase):
    def setUp(self):
        self.list1 = [foo(1), foo(2)]
        self.list2 = [foo(1), foo(2)]

    def test1(self):
        self.assertTrue(self.list1 == self.list2)

    def test2(self):
        self.assertEqual(self.list1, self.list2)

    def test3(self):
        self.assertEqual(sorted(self.list1), sorted(self.list2))

    def test4(self):
        self.assertItemsEqual(self.list1, self.list2)

if __name__=='__main__':
    unittest.main()

This is the output on my machine:

FAIL: test4 (__main__.test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "assert_test.py", line 25, in test4
    self.assertItemsEqual(self.list1, self.list2)
AssertionError: Element counts were not equal:
First has 1, Second has 0:  <__main__.foo object at 0x7f67b3ce2590>
First has 1, Second has 0:  <__main__.foo object at 0x7f67b3ce25d0>
First has 0, Second has 1:  <__main__.foo object at 0x7f67b3ce2610>
First has 0, Second has 1:  <__main__.foo object at 0x7f67b3ce2650>

----------------------------------------------------------------------
Ran 4 tests in 0.001s

FAILED (failures=1)

The relevant part of the documentation is here:

https://docs.python.org/2/reference/expressions.html?highlight=ordering#not-in

Most other objects of built-in types compare unequal unless they are the same object; the choice whether one object is considered smaller or larger than another one is made arbitrarily but consistently within one execution of a program.

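So if you create x, y = foo(1), foo(1), it is not well defined whether the ordering ends up as x > y or x < y. In Python 3 you would not be allowed to do this at all: the sorted call should raise an exception.

Since unittest calls setUp for each test method, a different set of foo instances is created each time.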
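To illustrate the Python 3 side of this, here is a minimal sketch. It assumes Python 3, and the names Foo, OrderedFoo and try_sort are made up for the example. A class that defines only __eq__ cannot be sorted there, while one that also defines __lt__ gets a well-defined order.

```python
class Foo(object):
    """Value-based equality, but no ordering defined."""
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return isinstance(other, Foo) and self.a == other.a


class OrderedFoo(Foo):
    """Adds an explicit ordering, so sorted() no longer depends on identity."""
    def __lt__(self, other):
        return self.a < other.a


def try_sort(items):
    """Return the sorted list, or None if the items cannot be ordered."""
    try:
        return sorted(items)
    except TypeError:
        # Python 3 refuses to order objects that do not define __lt__.
        return None


print(try_sort([Foo(2), Foo(1)]))
print([f.a for f in try_sort([OrderedFoo(2), OrderedFoo(1)])])
```

Under Python 2 the first call would not raise; it would return the items in an arbitrary (but consistent within one run) order, which is exactly the behaviour the quoted documentation describes.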


assertItemsEqual is implemented with collections.Counter (a subclass of dict), so I think the failure of test4 may be a symptom of this fact:

>>> x, y = foo(1), foo(1)
>>> x == y
True
>>> {x: None} == {y: None}
False

If two items compare equal, their hashes should be the same; otherwise you risk breaking mappings like this one.
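One way to restore that invariant is to define __hash__ from the same attribute that __eq__ compares. This is only a sketch of that option (the class name Foo and the choice to hash self.a are my assumptions, and it presumes a is itself hashable and is not mutated while the object sits in a dict or Counter):

```python
import collections


class Foo(object):
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return isinstance(other, Foo) and self.a == other.a

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so spell it out.
        return not self == other

    def __hash__(self):
        # Equal objects now hash equally, as dict and Counter require.
        return hash(self.a)


x, y = Foo(1), Foo(1)
print({x: None} == {y: None})
print(collections.Counter([Foo(1), Foo(2)]) == collections.Counter([Foo(2), Foo(1)]))
```

With a hash like this, counting by hashing groups equal objects together, so a Counter-based comparison sees the two lists as having the same elements.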

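Interestingly, the documented specification is divorced from the implementation, which never does any sorting. Here is the source code. As you can see, it first tries to count by hashing using collections.Counter. If this fails with a TypeError (because either list contains an item that is unhashable), it moves on to a second algorithm, which compares using Python == and an O(n^2) loop.

So if your foo class were unhashable, the second algorithm would signal a match. But it is perfectly hashable. From the docs: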
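In outline, that two-step strategy looks roughly like the following. This is a simplified sketch, not the actual unittest source: the function name items_equal is made up, and it returns a bool instead of building a failure message.

```python
import collections


def items_equal(first, second):
    """Return True if both iterables hold the same elements, ignoring order."""
    first, second = list(first), list(second)
    try:
        # Step 1: count by hashing. Raises TypeError on unhashable items.
        return collections.Counter(first) == collections.Counter(second)
    except TypeError:
        pass

    # Step 2: fall back to pairwise == comparison, O(n^2).
    remaining = list(second)
    for item in first:
        for i, candidate in enumerate(remaining):
            if item == candidate:
                del remaining[i]  # each element may be matched only once
                break
        else:
            return False  # nothing in second equals this item
    return not remaining  # leftovers mean second had extra elements


print(items_equal([1, 2, 2], [2, 1, 2]))
print(items_equal([[1], [2]], [[2], [1]]))
```

Note that the fallback is only reached when hashing raises. Objects that hash by identity but compare by value never get there, so step 1 simply reports them as different.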

Objects which are instances of user-defined classes are hashable by default; they all compare unequal (except with themselves), and their hash value is derived from their id().

I verified this by calling collections.Counter([foo(1)]). No TypeError was raised.

So here is where your code goes off the rails. From the docs on __hash__:

if it defines __cmp__() or __eq__() but not __hash__(), its instances will not be usable in hashed collections.

Unfortunately, "not usable" is evidently not the same as "unhashable."

It goes on to say:

Classes which inherit a __hash__() method from a parent class but change the meaning of __cmp__() or __eq__() such that the hash value returned is no longer appropriate (e.g. by switching to a value-based concept of equality instead of the default identity based equality) can explicitly flag themselves as being unhashable by setting __hash__ = None in the class definition.

If we redefine:

class foo(object):
    __hash__ = None
    def __init__(self, a):
        self.a = a
    def __eq__(self, other):
        return isinstance(other, foo) and self.a == other.a

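then all the tests pass!

So it seems the docs are not exactly wrong, but they are not very clear either. They should mention that counting is done by hashing, and that simple equality matching is tried only if hashing fails. That is a valid approach only when the objects either have full hashing semantics or are completely unhashable. Yours sit in a middle ground. (I believe Python 3 is stricter about disallowing, or at least warning about, classes of this type.)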
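As a quick check of that last point, here is a small sketch assuming Python 3, where assertItemsEqual is named assertCountEqual. The names Foo, ItemsTest and run_demo are made up for the example. In Python 3 a class that defines __eq__ without __hash__ has its __hash__ set to None implicitly, so the original class is unhashable without any extra line and the comparison falls back to ==.

```python
import unittest


class Foo:
    # Same shape as the class in the question: __eq__ but no __hash__.
    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return isinstance(other, Foo) and self.a == other.a


class ItemsTest(unittest.TestCase):
    def test_items(self):
        # Python 3 spelling of assertItemsEqual.
        self.assertCountEqual([Foo(1), Foo(2)], [Foo(2), Foo(1)])


def run_demo():
    """Run ItemsTest quietly and report whether it passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ItemsTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


print(Foo.__hash__ is None)
print(run_demo())
```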