Python: functools cmp_to_key 函数是如何工作的？

Question

在Python中，list.sort方法和sorted内置函数都接受一个名为key的可选参数，这是一个函数，给定一个来自该列表 return 是它的排序键。

旧的 Python 版本使用了不同的方法，使用 cmp 参数代替，这是一个函数，给定列表中的两个元素 returns 如果第一个是负数小于第二个，如果相等则为零，如果第一个更大则为正数。在某些时候，此参数已被弃用并且未包含在 Python 3.

中

前几天，我想以一种 cmp 函数比 key 函数更容易编写的方式对元素列表进行排序。我不想使用已弃用的功能，所以我阅读了文档，发现 functools 模块中有一个名为 cmp_to_key 的函数，正如他的名字所述，它接收到一个 cmp 函数和 return 一个 key 一个......或者这就是我的想法，直到我阅读了 docs 中包含的这个高级函数的源代码（或至少是等效版本）

def cmp_to_key(mycmp):
    'Convert a cmp= function into a key= function'
    class K(object):
        def __init__(self, obj, *args):
            self.obj = obj
        def __lt__(self, other):
            return mycmp(self.obj, other.obj) < 0
        def __gt__(self, other):
            return mycmp(self.obj, other.obj) > 0
        def __eq__(self, other):
            return mycmp(self.obj, other.obj) == 0
        def __le__(self, other):
            return mycmp(self.obj, other.obj) <= 0
        def __ge__(self, other):
            return mycmp(self.obj, other.obj) >= 0
        def __ne__(self, other):
            return mycmp(self.obj, other.obj) != 0
    return K

尽管 cmp_to_key 按预期工作，但令我惊讶的是这个函数不是 return 函数而是 K class .为什么？它是如何工作的？我猜 sorted 函数在内部检查 cmp 是函数还是 K class 或类似的东西，但我不确定。

P.S.: 尽管这很奇怪，但我发现 K class 非常有用。检查此代码：

from functools import cmp_to_key

def my_cmp(a, b):
    # some sorting comparison which is hard to express using a key function

class MyClass(cmp_to_key(my_cmp)):
    ...

这样，MyClass 的任何实例列表都可以默认按 my_cmp

中定义的标准排序

Answer 1

我刚刚意识到，尽管 K class 不是函数，但它是可调用的，因为它是 class！ classes 是可调用对象，当被调用时，它会创建一个新实例，通过调用相应的 __init__ 然后 returns 该实例对其进行初始化。

这样它的行为就像一个 key 函数，因为 K 在调用时接收对象，并将该对象包装在 K 实例中，该实例可以与其他 K 实例进行比较。

如果我错了请纠正我。我觉得我正在进入一个我不熟悉的元classes 领域。

Answer 2

不，sorted 函数（或 list.sort）在内部不需要检查它接收到的对象是函数还是 class。它只关心它在 key 参数中收到的对象应该是可调用的，并且应该 return 一个可以在调用时与其他值进行比较的值。

类也是可调用的，当您调用 class 时，您会收到 class 的实例。

要回答你的问题，首先我们需要了解（至少在基本层面上）key 论点是如何工作的 -

为每个元素调用 key 可调用函数，它接收回应该排序的对象。
接收到新对象后，它将此对象与其他对象进行比较（通过使用其他元素调用 key 可调用对象再次接收）。

这里要注意的重要一点是，将收到的新 object 与其他相同对象进行比较。

现在进入等效代码，当您创建 class 的实例时，可以使用 mycmp 函数将其与同一 class 的其他实例进行比较。并在对值进行排序时对这些对象进行排序（实际上）调用您的 mycmp() 函数来确定该值是小于还是大于其他对象。

带有打印语句的示例 -

>>> def cmp_to_key(mycmp):
...     'Convert a cmp= function into a key= function'
...     class K(object):
...         def __init__(self, obj, *args):
...             print('obj created with ',obj)
...             self.obj = obj
...         def __lt__(self, other):
...             print('comparing less than ',self.obj)
...             return mycmp(self.obj, other.obj) < 0
...         def __gt__(self, other):
...             print('comparing greter than ',self.obj)
...             return mycmp(self.obj, other.obj) > 0
...         def __eq__(self, other):
...             print('comparing equal to ',self.obj)
...             return mycmp(self.obj, other.obj) == 0
...         def __le__(self, other):
...             print('comparing less than equal ',self.obj)
...             return mycmp(self.obj, other.obj) <= 0
...         def __ge__(self, other):
...             print('comparing greater than equal',self.obj)
...             return mycmp(self.obj, other.obj) >= 0
...         def __ne__(self, other):
...             print('comparing not equal ',self.obj)
...             return mycmp(self.obj, other.obj) != 0
...     return K
...
>>> def mycmp(a, b):
...     print("In Mycmp for", a, ' ', b)
...     if a < b:
...         return -1
...     elif a > b:
...         return 1
...     return 0
...
>>> print(sorted([3,4,2,5],key=cmp_to_key(mycmp)))
obj created with  3
obj created with  4
obj created with  2
obj created with  5
comparing less than  4
In Mycmp for 4   3
comparing less than  2
In Mycmp for 2   4
comparing less than  2
In Mycmp for 2   4
comparing less than  2
In Mycmp for 2   3
comparing less than  5
In Mycmp for 5   3
comparing less than  5
In Mycmp for 5   4
[2, 3, 4, 5]

Answer 3

我没有查看源代码，但我相信 key 函数的结果也可以是任何东西，因此也是一个可比较的对象。 cmp_to_key 只是屏蔽了那些 K 个对象的创建，这些对象在排序时相互比较。

如果我尝试像这样对部门和反向房间号进行排序：

departments_and_rooms = [('a', 1), ('a', 3),('b', 2)]
departments_and_rooms.sort(key=lambda vs: vs[0])
departments_and_rooms.sort(key=lambda vs: vs[1], reverse=True)
departments_and_rooms # is now [('a', 3), ('b', 2), ('a', 1)]

这不是我想要的，我认为排序仅在每次调用时稳定，documentation 误导了我：

The sort() method is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

旧式方法之所以有效，是因为每个结果调用 K class returns 一个 K 实例并与 mycmp:

的结果进行比较

def mycmp(a, b):                             
    return cmp((a[0], -a[1]), (b[0], -b[1]))

departments_and_rooms = [('a', 1), ('a', 3),('b', 2)]
departments_and_rooms.sort(key=cmp_to_key(mycmp))
departments_and_rooms # is now [('a', 3), ('a', 1), ('b', 2)]

这是一个重要的区别，开箱即用不能进行多次传递。关键函数的 values/results 必须是可排序的相对顺序，而不是要排序的元素。因此是 cmp_to_key 掩码：创建那些需要对其进行排序的可比较对象。

希望对您有所帮助。并感谢 cmp_to_key 代码中的洞察力，也对我有很大帮助:)

Python: functools cmp_to_key 函数是如何工作的？

Python: how does the functools cmp_to_key function works?

python

sorting

deprecated