Using queue.PriorityQueue, not caring about comparisons

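I'm trying to use queue.PriorityQueue in Python 3(.6).

I would like to store objects with a given priority. But if two objects have the same priority, I don't mind which of them PriorityQueue.get returns. In other words, my objects can't be compared like integers, it wouldn't make sense to allow them to be compared, and I only care about the priority.

In Python 3.7's documentation, there's a solution involving dataclasses. I quote: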

If the data elements are not comparable, the data can be wrapped in a class that ignores the data item and only compares the priority number:

from dataclasses import dataclass, field
from typing import Any

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any=field(compare=False)
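
As a rough sketch of how this wrapper might be used on Python 3.7+ (the Task class and the task names are made up for illustration), entries with equal priority can be queued even though the payloads themselves cannot be compared:

```python
from dataclasses import dataclass, field
from queue import PriorityQueue
from typing import Any


@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)  # excluded from ==, <, <=, >, >=


class Task:
    """Hypothetical payload type that defines no ordering methods."""

    def __init__(self, name):
        self.name = name


q = PriorityQueue()
q.put(PrioritizedItem(2, Task("deploy")))
q.put(PrioritizedItem(1, Task("build")))
q.put(PrioritizedItem(1, Task("lint")))  # ties with "build"; no TypeError

# Drain the queue; only the priority decides the order.
drained = []
while not q.empty():
    entry = q.get()
    drained.append((entry.priority, entry.item.name))

print(drained)
```

Which of the two priority-1 tasks comes out first should not be relied upon.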

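Alas, I'm using Python 3.6. In the documentation of this version of Python, there is no comment on using PriorityQueue for the priorities alone, without bothering about the "object value", which wouldn't be logical in my case.

Is there a better way than defining __le__ and the other comparison methods on my custom class? I find this solution particularly ugly and counter-intuitive, but that might just be me.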

See the priority queue implementation notes - just before the section you quoted (regarding using dataclasses) it tells you how to do it without them:

... is to store entries as 3-element list including the priority, an entry count, and the task. The entry count serves as a tie-breaker so that two tasks with the same priority are returned in the order they were added. And since no two entry counts are the same, the tuple comparison will never attempt to directly compare two tasks.

So simply add your item as the 3rd element of a tuple (Prio, Count, YourElem) when adding it to your queue.

Contrived example:

from queue import PriorityQueue

class CompareError(ValueError): pass

class O:
    def __init__(self,n):
        self.n = n

    def __lt__(self, other):
        raise CompareError

    def __repr__(self): return str(self)
    def __str__(self): return self.n

def add(prioqueue,prio,item):
    """Adds the 'item' with 'prio' to the 'prioqueue', adding a unique value that
    is stored as function attribute 'add.n' and incremented on each usage."""
    prioqueue.put( (prio, add.n, item))
    add.n += 1

# no len() on PriorityQueue - we keep our unique integer as an attribute of add()
# if you forget to initialise it, you get an AttributeError
add.n = 0

h = PriorityQueue()

add(h, 7, O('release product'))
add(h, 1, O('write spec 3'))
add(h, 1, O('write spec 2'))
add(h, 1, O('write spec 1'))
add(h, 3, O('create tests'))

for _ in range(4):
    item = h.get()
    print(item)

Putting a 2-tuple without the counter into the same queue, e.g. h.put( (1, O('write spec 1')) ), fails as soon as it is compared with a queued entry of the same priority, because the O instance is then compared with that entry's counter. With the class O above this raises CompareError; for a class that defines no comparison methods at all it leads to

TypeError: '<' not supported between instances of 'O' and 'int'

Using def add(prioqueue,prio,item): pushes triplets whose 2nd value is guaranteed to be distinct, so our O() instances are never used as a tie-breaker.

Output:

(1, 1, write spec 3)
(1, 2, write spec 2)
(1, 3, write spec 1)
(3, 4, create tests)

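See MartijnPieters' answer for a nicer unique 2nd element.

Let's assume that we don't want to write a decorator with functionality equivalent to dataclass. The problem is that we don't want to have to define all of the comparison operators just to make our custom class comparable by priority. The @functools.total_ordering decorator can help. Excerpt: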

Given a class defining one or more rich comparison ordering methods, this class decorator supplies the rest. This simplifies the effort involved in specifying all of the possible rich comparison operations:

The class must define one of __lt__(), __le__(), __gt__(), or __ge__(). In addition, the class should supply an __eq__() method.

Using the provided example:

from functools import total_ordering

@total_ordering
class PrioritizedItem:
    # ...

    def __eq__(self, other):
        return self.priority == other.priority

    def __lt__(self, other):
        return self.priority < other.priority

All you need is a wrapper class that implements __lt__ for PriorityQueue to work correctly. This is noted here:

The sort routines are guaranteed to use __lt__() when making comparisons between two objects. So, it is easy to add a standard sort order to a class by defining an __lt__() method

It's as simple as something like this

class PriorityElem:
    def __init__(self, elem_to_wrap):
        self.wrapped_elem = elem_to_wrap

    def __lt__(self, other):
        return self.wrapped_elem.priority < other.wrapped_elem.priority

If your elements do not carry a priority themselves, it's as simple as:

class PriorityElem:
    def __init__(self, elem_to_wrap, priority):
        self.wrapped_elem = elem_to_wrap
        self.priority = priority

    def __lt__(self, other):
        return self.priority < other.priority

Now you can use PriorityQueue like so

from queue import PriorityQueue

queue = PriorityQueue()
queue.put(PriorityElem(my_custom_class1, 10))
queue.put(PriorityElem(my_custom_class2, 10))
queue.put(PriorityElem(my_custom_class3, 30))

first_returned_elem = queue.get()
# first_returned_elem is PriorityElem(my_custom_class1, 10)
second_returned_elem = queue.get()
# second_returned_elem is PriorityElem(my_custom_class2, 10)
third_returned_elem = queue.get()
# third_returned_elem is PriorityElem(my_custom_class3, 30)

Getting at your original element in that case would be as simple as

elem = queue.get().wrapped_elem

Since you don't care about sort stability, that's all you need.

Edit: As noted in the comments and confirmed here, heappush is not stable:

unlike sorted(), this implementation is not stable.
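
If first-in, first-out order among equal priorities ever does matter, one possible variation is to compare on a (priority, insertion number) pair. This is only a sketch, not part of the wrapper above; the class name StablePriorityElem and its _counter attribute are invented here for illustration:

```python
from itertools import count
from queue import PriorityQueue


class StablePriorityElem:
    """Hypothetical wrapper: orders by priority, then by creation order."""

    _counter = count()  # shared source of sequence numbers (illustrative)

    def __init__(self, elem_to_wrap, priority):
        self.wrapped_elem = elem_to_wrap
        self.priority = priority
        self.seq = next(self._counter)

    def __lt__(self, other):
        # Tuple comparison: priority first, sequence number breaks ties.
        return (self.priority, self.seq) < (other.priority, other.seq)


q = PriorityQueue()
for name, prio in [("a", 10), ("b", 30), ("c", 10), ("d", 10)]:
    q.put(StablePriorityElem(name, prio))

order = [q.get().wrapped_elem for _ in range(4)]
print(order)  # ['a', 'c', 'd', 'b']
```

Because no two wrappers share the same (priority, seq) pair, the heap has exactly one valid order to return them in.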

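dataclasses is just a convenient way to avoid having to write a lot of boilerplate code.

You don't actually have to create a class. A tuple with a unique counter value will do too: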

from itertools import count

unique = count()

q.put((priority, next(unique), item))

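So ties between equal priorities are broken by the integer that follows; because it is always unique, the item value is never consulted.

You can also create a class using straight-up rich comparison methods, made simpler with @functools.total_ordering: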
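
Put together, a minimal self-contained sketch of this approach could look as follows. The Job class and the put_job helper are made-up names for illustration, and the payloads deliberately define no comparison methods:

```python
from itertools import count
from queue import PriorityQueue


class Job:
    """Hypothetical payload with no ordering methods."""

    def __init__(self, name):
        self.name = name


q = PriorityQueue()
unique = count()  # 0, 1, 2, ... never repeats


def put_job(priority, job):
    # The counter sits between priority and payload, so tuple comparison
    # is always decided before it could reach the Job instance.
    q.put((priority, next(unique), job))


put_job(1, Job("write spec"))
put_job(1, Job("review spec"))  # same priority, no TypeError
put_job(0, Job("fix outage"))

names = []
while not q.empty():
    priority, _, job = q.get()
    names.append(job.name)

print(names)  # ['fix outage', 'write spec', 'review spec']
```

As a side effect, entries with the same priority come back in the order they were added.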

from functools import total_ordering

@total_ordering
class PrioritizedItem:
    def __init__(self, priority, item):
        self.priority = priority
        self.item = item

    def __eq__(self, other):
        if not isinstance(other, __class__):
            return NotImplemented
        return self.priority == other.priority

    def __lt__(self, other):
        if not isinstance(other, __class__):
            return NotImplemented
        return self.priority < other.priority
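
A short usage sketch, with the class repeated so the snippet runs on its own (the dict payloads are arbitrary examples, chosen because dicts do not support <): total_ordering derives <=, > and >= from __lt__ and __eq__, and the queue only ever looks at priority.

```python
from functools import total_ordering
from queue import PriorityQueue


@total_ordering
class PrioritizedItem:
    def __init__(self, priority, item):
        self.priority = priority
        self.item = item

    def __eq__(self, other):
        if not isinstance(other, __class__):
            return NotImplemented
        return self.priority == other.priority

    def __lt__(self, other):
        if not isinstance(other, __class__):
            return NotImplemented
        return self.priority < other.priority


a = PrioritizedItem(1, {"job": "a"})  # payloads are never compared
b = PrioritizedItem(1, {"job": "b"})
c = PrioritizedItem(5, {"job": "c"})

print(a <= b, c > a, a == b)  # True True True

q = PriorityQueue()
for entry in (c, a, b):
    q.put(entry)

priorities = [q.get().priority for _ in range(3)]
print(priorities)  # [1, 1, 5]
```

Note that == here means "same priority", not "same item", so two wrappers around different payloads can compare equal.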