Sorting tuples with a custom sorting function using a range?

I want to sort a list of tuples by the last two columns:

mylist = [(33, 36, 84), 
          (34, 37, 656), 
          (23, 38, 42)]

I know I can do this:

final = sorted(mylist, key=lambda x: [x[1], x[2]])

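My problem now is that I want to compare the second column with a special condition: if the difference between two numbers is less than an offset, they should be treated as equal (36 == 37 == 38), and the third column should then be used to sort the list. The end result I would like to see is: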

mylist = [(23, 38, 42),
          (33, 36, 84), 
          (34, 37, 656)]

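I am thinking about creating my own integer type and overriding the equality operator. Is that possible? Is it overkill? Is there a better way to solve this problem?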

Sometimes an old-style sort based on a cmp function is easier than one based on a key. So, write a cmp function and then use functools.cmp_to_key to convert it into a key:

import functools

def compare(s,t,offset):
    _,y,z = s
    _,u,v = t
    if abs(y-u) > offset: #use 2nd component
        if y < u:
            return -1
        else:
            return 1
    else: #use 3rd component
        if z < v:
            return -1
        elif z == v:
            return 0
        else:
            return 1

mylist = [(33, 36, 84), 
          (34, 37, 656), 
          (23, 38, 42)]

mylist.sort(key = functools.cmp_to_key(lambda s,t: compare(s,t,2)))

for t in mylist: print(t)

Output:

(23, 38, 42)
(33, 36, 84)
(34, 37, 656)

Look up "The Old Way Using the cmp Parameter" in https://wiki.python.org/moin/HowTo/Sorting. This lets you write your own comparison function, rather than just setting a key and relying on the comparison operators.

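Sorting like this is dangerous. Look up "strict weak ordering". Because "equal within an offset" is not transitive, you may end up with several different valid orderings. That can break other code that assumes there is exactly one correct way to sort.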
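To make the warning concrete, here is a small sketch. The three tuples and the offset of 1 are made up for illustration, and compare() follows the same rule as the cmp-based answer above. With this data the comparator forms a cycle (a after b, b after c, yet a before c), so there is no single order that satisfies every pairwise comparison, and what sorted() returns can depend on the order of the input.

```python
import functools
import itertools

def compare(s, t, offset):
    """Order by the 2nd column if it differs by more than offset,
    otherwise order by the 3rd column."""
    _, y, z = s
    _, u, v = t
    if abs(y - u) > offset:
        return -1 if y < u else 1
    return (z > v) - (z < v)

# Hypothetical data: neighbours count as "equal", the two ends do not.
a, b, c = (0, 36, 3), (0, 37, 2), (0, 38, 1)
offset = 1

print(compare(a, b, offset))  # 1  -> a sorts after b (3rd column decides)
print(compare(b, c, offset))  # 1  -> b sorts after c (3rd column decides)
print(compare(a, c, offset))  # -1 -> a sorts before c (2nd column decides)

# Sort every permutation of the input and collect the distinct results.
key = functools.cmp_to_key(lambda s, t: compare(s, t, offset))
results = {tuple(sorted(p, key=key)) for p in itertools.permutations([a, b, c])}
for r in results:
    print(r)
```

If more than one line is printed by the last loop, the "sorted" result is not unique for this comparator.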

Now, to actually answer your question:

mylist = [(33, 36, 84), 
          (34, 37, 656), 
          (23, 38, 42)]

def custom_sort_term(x, y, offset = 2):
    if abs(x-y) <= offset:
        return 0
    return x-y

def custom_sort_function(x, y):
    x1 = x[1]
    y1 = y[1]
    first_comparison_result = custom_sort_term(x1, y1)
    if (first_comparison_result):
        return first_comparison_result
    x2 = x[2]
    y2 = y[2]
    return custom_sort_term(x2, y2)

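# Python 3 removed the cmp= argument of sorted(), so wrap the comparator with cmp_to_key.
import functools
final = sorted(mylist, key=functools.cmp_to_key(custom_sort_function))
print(final)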

[(23, 38, 42), (33, 36, 84), (34, 37, 656)]

I think the simplest approach is to create a new class that compares the way you want:

mylist = [(33, 36, 84), 
          (34, 37, 656), 
          (23, 38, 42)]

offset = 2

class Comp(object):
    def __init__(self, tup):
        self.tup = tup

    def __lt__(self, other):  # sorted works even if only __lt__ is implemented.
        # If the second items differ by no more than the offset, compare the third items
        if abs(self.tup[1] - other.tup[1]) <= offset:
            return self.tup[2] < other.tup[2]
        # otherwise compare them as usual
        else:
            return (self.tup[1], self.tup[2]) < (other.tup[1], other.tup[2])

A sample run shows your expected result:

>>> sorted(mylist, key=Comp)
[(23, 38, 42), (33, 36, 84), (34, 37, 656)]

I think it is a bit cleaner than using functools.cmp_to_key, but that is a matter of personal preference.
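One possible variation, as a sketch only: the class above reads offset from a module-level variable. A small factory can bind the offset instead, so different sorts can use different offsets. The name make_comp is made up for illustration and is not part of any library. Since the second items always differ when the else branch is reached, comparing them alone gives the same result as comparing the (second, third) pairs.

```python
def make_comp(offset):
    """Return a key class whose __lt__ treats 2nd items within offset as equal."""
    class Comp:
        __slots__ = ("tup",)

        def __init__(self, tup):
            self.tup = tup

        def __lt__(self, other):
            # 2nd items close enough: fall back to the 3rd item.
            if abs(self.tup[1] - other.tup[1]) <= offset:
                return self.tup[2] < other.tup[2]
            # Otherwise the 2nd item alone decides.
            return self.tup[1] < other.tup[1]

    return Comp

mylist = [(33, 36, 84),
          (34, 37, 656),
          (23, 38, 42)]

result = sorted(mylist, key=make_comp(2))
print(result)  # [(23, 38, 42), (33, 36, 84), (34, 37, 656)]

# With offset 0 this reduces to a plain sort on (2nd, 3rd) columns.
plain = sorted(mylist, key=make_comp(0))
print(plain)   # [(33, 36, 84), (34, 37, 656), (23, 38, 42)]
```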

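Not very pretty, but I tried to be general in interpreting the OP's problem statement.

I expanded the test case and then applied a simple blunt instrument: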

# test case expanded
mylist = [(33, 6, 104),
          (31, 36, 84),
          (35, 86, 84),
          (30, 9, 4),
          (23, 38, 42),
          (34, 37, 656),          
          (33, 88, 8)]

threshld = 2    # different final output can be seen if changed to 1, 3, 30

def collapse(nums, threshld):
    """
    takes sorted (increasing) list of numbers, nums
    replaces runs of consecutive nums
    that successively differ by threshld or less
    with 1st number in each run
    """
    cnums = nums[:]
    cur = nums[0]
    for i in range(len(nums)-1):
        if (nums[i+1] - nums[i]) <= threshld:
            cnums[i+1] = cur
        else:
            cur = cnums[i+1]
    return cnums

mylists = [list(i) for i in mylist] # change the tuples to lists to modify

indxd=[e + [i] for i, e in enumerate(mylists)] # append the original indexing    
#print(*indxd, sep='\n')
im0 = sorted(indxd, key=lambda x: [ x[1]])      # sort by  middle number   
cns = collapse([i[1] for i in im0], threshld)   # then collapse()
#print(cns)
for i in range(len(im0)):                       # overwrite collapsed into im0
    im0[i][1] = cns[i]   
#print(*im0, sep='\n')
im1 = sorted(im0, key=lambda x: [ x[1], x[2]])  # now do 2 level sort
#print(*sorted(im0, key=lambda x: [ x[1], x[2]]), sep='\n')
final = [mylist[im1[i][3]] for i in range(len(im1))] # rebuild using new order 
                                                     # of original indices       
print(*final, sep='\n')

(33, 6, 104)
(30, 9, 4)
(23, 38, 42)
(31, 36, 84)
(34, 37, 656)
(33, 88, 8)
(35, 86, 84)
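The same idea (sort by the middle number, group runs whose successive middle values differ by the threshold or less, then order each run by the third number) can probably be written more compactly. The following is a rough sketch under that interpretation; sort_with_tolerance is a name invented here, not something from the code above.

```python
def sort_with_tolerance(rows, threshold):
    """Sort by column 1, treating runs of close values as one bucket,
    and order each bucket by column 2."""
    by_middle = sorted(rows, key=lambda r: r[1])
    result = []
    group = []
    for row in by_middle:
        # A gap larger than the threshold closes the current bucket.
        if group and row[1] - group[-1][1] > threshold:
            result.extend(sorted(group, key=lambda r: r[2]))
            group = []
        group.append(row)
    # Flush the last bucket (a no-op for empty input).
    result.extend(sorted(group, key=lambda r: r[2]))
    return result

mylist = [(33, 6, 104),
          (31, 36, 84),
          (35, 86, 84),
          (30, 9, 4),
          (23, 38, 42),
          (34, 37, 656),
          (33, 88, 8)]

final = sort_with_tolerance(mylist, 2)
print(*final, sep='\n')
```

For the expanded test case and a threshold of 2 this should give the same order as the output listed above. Because buckets are built from successive differences, a long chain of close values ends up in one bucket even if its first and last values are far apart.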