使用快速排序对列表进行排序时出错 "maximum recursion depth exceeded"

Error "maximum recursion depth exceeded" while sorting a list with Quicksort

我正在尝试使用快速排序算法对几乎已排序的 100,000 个数字列表进行排序,但我收到此错误:

Traceback (most recent call last):
  File "/Users/MacbookPro/Documents/Faculta/alg sortare pyth/quicksort.py", line 48, in <module>
    quickSort(alist)
  File "/Users/MacbookPro/Documents/Faculta/alg sortare pyth/quicksort.py", line 5, in quickSort
    quickSortHelper(alist,0,len(alist)-1)
  File "/Users/MacbookPro/Documents/Faculta/alg sortare pyth/quicksort.py", line 12, in quickSortHelper
    quickSortHelper(alist,first,splitpoint-1)
  File "/Users/MacbookPro/Documents/Faculta/alg sortare pyth/quicksort.py", line 12, in quickSortHelper
    quickSortHelper(alist,first,splitpoint-1)
.
.
(a bunch of the last line)
.
.
  File "/Users/MacbookPro/Documents/Faculta/alg sortare pyth/quicksort.py", line 10, in quickSortHelper
    splitpoint = partition(alist,first,last)
  File "/Users/MacbookPro/Documents/Faculta/alg sortare pyth/quicksort.py", line 25, in partition
    while leftmark <= rightmark and alist[leftmark] <= pivotvalue:
RuntimeError: maximum recursion depth exceeded in cmp

这是我的代码:

from timeit import default_timer as timer
import resource

start = timer()

def quickSort(alist):
   quickSortHelper(alist,0,len(alist)-1)

def quickSortHelper(alist,first,last):
   if first<last:    
       splitpoint = partition(alist,first,last)    
       quickSortHelper(alist,first,splitpoint-1)
       quickSortHelper(alist,splitpoint+1,last)


def partition(alist,first,last):
   pivotvalue = alist[first]    
   leftmark = first+1
   rightmark = last    
   done = False
   while not done:    
       while leftmark <= rightmark and alist[leftmark] <= pivotvalue:
           leftmark = leftmark + 1    
       while alist[rightmark] >= pivotvalue and rightmark >= leftmark:
           rightmark = rightmark -1    
       if rightmark < leftmark:
           done = True
       else:
           temp = alist[leftmark]
           alist[leftmark] = alist[rightmark]
           alist[rightmark] = temp    
   temp = alist[first]
   alist[first] = alist[rightmark]
   alist[rightmark] = temp   
   return rightmark

with open('lista.txt', 'r') as f:
    long_string = f.readline()
    alist = long_string.split(',')
quickSort(alist)
f = open("quick.txt", "w")
print >>f,(alist)
print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000
end = timer()
print(end - start)
f.close()
print 'Quick\n'

我之前尝试过对同一个列表进行排序,先是随机打乱顺序,然后就成功了。

当输入已经排序时,您选择的枢轴元素(范围中最左边的元素)将给出“错误”拆分。 “坏”的意思是 partition 将 return 一个 rightmark 等于 first[ 的值=49=].

这意味着在 quickSortHelpern 值的列表将被拆分为零元素列表(第一次递归调用)和 n-1 值的列表(第二次递归调用)。如果输入已经排序,则此模式将在每个递归级别不断重复。因此,如果您的输入大小为 1000,则递归深度将为 1000。Python 保持最大递归深度,在该深度将触发异常。对于大型排序列表,您将遇到此限制。

作为旁注,在那种情况下,您还有最坏情况 运行 时间 O(n²).

如何解决这个问题?

您可以将 Python 配置为 allow deeper recursion, with sys.setrecursionlimit。但这是不可取的,因为这意味着您仍然使用大量堆栈内存,并且不要摆脱最坏情况 运行 输入排序的时间。

解决这个问题的更好方法是在索引 firstlast 之间的某个位置随机选择主元元素,然后交换值在具有最左边值(或最右边,但在您的实现中它将被保留)的索引处。这会将 超出最大递归深度 错误的概率降低到几乎为零。

另请参阅 Wikipedia 如何引用您遇到的问题(我以粗体强调):

Choice of pivot

In the very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use-case. The problem was easily solved by choosing either a random index for the pivot, choosing the middle index of the partition or (especially for longer partitions) choosing the median of the first, middle and last element of the partition for the pivot (as recommended by Sedgewick).

要实现随机选择的数据透视索引,请将这两行添加到 partition 函数的最顶部:

pivotmark = random.randint(first,last)
alist[first], alist[pivotmark] = alist[pivotmark], alist[first]

当然需要import random.

这将解决您遇到的问题。如果您想进一步提高性能,您可能会看看现有的其他几种解决方案,例如上面引用的维基百科文章中提到的那些,看看它们如何处理您正在处理的输入。

注意:另请参阅如何在一行中交换元素...