中位数的中位数 select python

median of medians select python

我正在实施 Select 算法(a.k.a。确定性 Select)。我已经让它适用于小 arrays/lists,但是当我的数组大小超过 26 时,它会因以下错误而中断:"RuntimeError: maximum recursion depth exceeded"。对于大小为 25 及以下的数组,没有问题。

我的最终目标是 运行 用于大小为 500 的数组并进行多次迭代。迭代不是问题。我已经研究过 Whosebug 并看过文章:Python implementation of "median of medians" algorithm 等。我有一种预感,随机生成的数组中的重复项可能导致了问题,但事实并非如此。

这是我的代码:

import math
import random

# Insertion Sort Khan Academy video: https://www.youtube.com/watch?v=6pyeMmJTefg&list=PL36E7A2B75028A3D6&index=22

def insertion_sort(A):  # Sorting it in place
    for index in range(1, len(A)):# range is up to but not including len(A)
      value = A[index]
      i = index - 1           # index of the item that is directly to the left
      while i >= 0:
        if value < A[i]:
          A[i + 1] = A[i]
          A[i] = value
          i = i - 1
        else:
          break

timeslo = 0  # I think that this is a global variable

def partition(A, p):
  global timeslo
  hi = [] #hold things larger than our pivot
  lo = [] #  "     "   smaller  "   "   "
  for x in A:       # walk through all the elements in the Array A.
    if x <p:
      lo = lo + [x]
      timeslo = timeslo + 1  #keep track no. of comparisons
    else:
      hi = hi + [x]
  return lo,hi,timeslo

def get_chunks(Acopy, n):
                                    # Declare some empty lists to hold our chunks
  chunk = []
  chunks = []
                                    # Step through the array n element at a time
  for x in range(0, len(Acopy), n): # stepping by size n starting at the beginning
                                    # of the array
    chunk = Acopy[x:x+n]            # Extract 5 elements                           
                                    # sort chunk and find its median
    insertion_sort(chunk) # in place sort of chunk of size 5
    # get the median ... (i.e. the middle element)
    # Add them to list



 mindex = (len(chunk)-1)/2  # pick middle index each time

    chunks.append(chunk[mindex]) 
#     chunks.append(chunk)                        # assuming subarrays are size 5 and we want the middle
                                                  # this caused some trouble because not all subarrays were size 5
                            # index which is 2.
  return chunks


def Select(A, k): 

  if (len(A) == 1):  # if the array is size 1 then just return the one and only element
    return A[0]
  elif (len(A) <= 5): # if length is 5 or less, sort it and return the kth smallest element
    insertion_sort(A)
    return A[k-1]
  else:
    M = get_chunks(A, 5)  # this will give you the array of medians,,, don't sort it....WHY ???



    m = len(M)           # m is the size of the array of Medians M.

    x  = Select(M, m/2)# m/2 is the same as len(A)/10  FYI

    lo, hi, timeslo = partition(A, x) 

    rank = len(lo) + 1

    if rank == k: # we're in the middle -- we're done
      return x, timeslo    # return the value of the kth smallest element
    elif k < rank:
      return Select(lo, k) # ???????????????
    else:
      return Select(hi, k-rank)

################### TROUBLESHOOTING   ################################
#   Works with arrays of size 25 and 5000 iterations
#   Doesn't work with     "   26 and 5000    "
#
#  arrays of size 26 and 20 iterations breaks it    ?????????????????

# A = []
Total = 0
n = input('What size of array of random #s do you want?: ')
N = input('number of iterations: ')

# n = 26
# N = 1

for x in range(0, N):
  A = random.sample(range(1,1000), n)  # make an array or list of size n
  result = Select(A, 2)      #p is the median of the medians, 2 means the 3rd smallest element
  Total = Total + timeslo             # the total number of comparisons made
print("the result is"), result
print("timeslo = "), timeslo
print("# of comparisons = "), Total

# A = [7, 1, 3, 5, 9, 2, 83, 8, 4, 13, 17, 21, 16, 11, 77, 33, 55, 44, 66, 88, 111, 222]
# result = Select(A, 2)  
# print("Result = "), result  

如有任何帮助,我们将不胜感激。

更改此行
return x, timeslo # return the value of the kth smallest element
进入
return x # return the value of the kth smallest element

最后打印出来可以得到timeslo。返回 xtimeslo 是不正确的,因为它将在 partition(A, p) 中用于拆分数组,其中参数 p 应该是前面语句的中位数 x = Select(M, m/2)