Binary Search with results returned as Indices

Question

As part of a course (Algorithmic Toolbox on Coursera), I am implementing binary search in Python.

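The challenge is to create an implementation of binary search that returns the index of the query (an integer) in the array. For example, if I call binary search with the array [1, 5, 7] and my query is 5, it should return 1. If the target is not in the array, it should return -1. For instance, if I give it [1, 67, 88, 200] and my target is 999, it should return -1.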

The problem assumes that every test case supplies an array that is sorted and contains no duplicates.

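My current implementation uses a helper function to actually perform the binary search. That helper returns either the target value itself (if it is found) or -1 (if it is not). The main function, which calls the helper, takes the result and, if it is not -1, looks up the original index in a dictionary that I build at the start of the main function.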

On my own private test cases the code produces correct results, but on submission I am told that it takes too long to run to be considered acceptable.

So my request is this: please review my code and help me figure out how to change it so that it runs more efficiently. My code is pasted below:

def single_binary_search(keys, target): 
    '''
    Takes an array of integers, keys, and searches for a value, target, in the array.  If the target is found to be 
    within the array, it returns the target value.  Else, it returns -1 if the array is considered invalid or if the
    target value does not exist within the keys array.  
    
    Input: 
        keys: an array of integers sorted in increasing order 
        target: an integer we wish to ascertain is within keys  
    Returns:
        the target value if it is located, or -1 if it is not or if the array is invalid 
    '''
    start = 0 # define start index
    end = len(keys)-1 # define end index
    if end < start: # check if array is invalid or if target is not in keys.  If so, return -1. 
        return -1 
    mp = start + (end-start)//2 # calculate the midpoint index
    if target == keys[mp]: # check to see if the target is at the midpoint.  
        return keys[mp] # return the target value if located in keys
    elif target < keys[mp]: # if target is less than mp, recursively call binary search on the lower array 
        return single_binary_search(keys[:mp], target)
    else:  # if target is greater than the mp, recursively call binary search on the upper array
        return single_binary_search(keys[mp+1:], target)

def binary_search(keys, query): 
    keys_dictionary = {k: v for v, k in enumerate(keys)} # Create a dictionary of keys to track the indices 
    result = single_binary_search(keys, query) # search for the individual query in keys and store the result 
    if result != -1: # if we found the target, use the keys dictionary to look up the index and return it 
        return keys_dictionary[result]
    else: # if the target was not found or the keys array was invalid, return -1
        return result

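An earlier implementation tried to add two new parameters, start = 0 and end = None, to the single_binary_search() function, but I abandoned it because I kept running into endless recursion that I only knew how to fix by adding specific if conditions for the exceptional cases.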

Answer 1

Here is a more conventional binary search (see also Wikipedia) that will perform better:

def binary_search(keys, query): 
    L, R = 0, len(keys) - 1
    while L <= R:
        M = L + (R - L) // 2
        if keys[M] == query:
            return M
        elif query < keys[M]:
            R = M - 1
        else:
            L = M + 1
    return -1

print("keys:", list(range(0,20,2)))
print(14, binary_search(list(range(0,20,2)), 14))
print(15, binary_search(list(range(0,20,2)), 15))
print(-1, binary_search(list(range(0,20,2)), -1))
print(20, binary_search(list(range(0,20,2)), 20))
print(0, binary_search(list(range(0,20,2)), 0))
print(18, binary_search(list(range(0,20,2)), 18))

Test case output:

keys: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
14 7
15 -1
-1 -1
20 -1
0 0
18 9

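Some of the problems with your implementation: slicing is expensive and best avoided, and your dictionary mechanism seems unnecessary, because binary search operates on indices of the array being searched, so no extra work is needed to obtain the index of a match (if there is one).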
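As a point of comparison, the same index-returning behaviour can probably be had from Python's standard library. The sketch below is not part of the original answer: it assumes keys is a sorted list without duplicates, as the question states, and the name binary_search_bisect is made up for illustration. It relies on bisect.bisect_left, which returns the leftmost insertion position for the query.

```python
from bisect import bisect_left

def binary_search_bisect(keys, query):
    """Return the index of query in the sorted list keys, or -1 if absent.

    Hypothetical helper for illustration; assumes keys is sorted ascending.
    """
    i = bisect_left(keys, query)  # leftmost position where query could be inserted
    # query is present only if that position is in range and holds query itself
    if i < len(keys) and keys[i] == query:
        return i
    return -1

keys = list(range(0, 20, 2))
print(14, binary_search_bisect(keys, 14))
print(15, binary_search_bisect(keys, 15))
```

No slices and no dictionary are created here, so the work per query stays logarithmic in the length of keys.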

UPDATE: Here is a recursive approach that works equally well:

def binary_search(keys, query, L = 0, R = None): 
    R = R if R is not None else len(keys) - 1
    M = L + (R - L) // 2
    if L > R:
        return -1
    elif keys[M] == query:
        return M
    elif query < keys[M]:
        R = M - 1
    else:
        L = M + 1
    return binary_search(keys, query, L, R)

Answer 2

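Creating the argument keys[:mp] is a slicing operation: it makes a copy and takes time linear in the size of that copy. So your algorithm actually takes n/2 + n/4 + n/8 + … = O(n) time. If you pass only the indices of the first and last elements of the new subarray, the algorithm takes logarithmic time. You can keep using the same list and pass the indices as arguments.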
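To make the n/2 + n/4 + n/8 + … argument concrete, here is a rough sketch that only counts work instead of timing it. It assumes a worst-case miss where the target is larger than every key, so the slicing version always recurses on keys[mp+1:]. The function names copied_elements and index_steps are invented for this illustration.

```python
def copied_elements(n):
    """Count elements copied by a slicing binary search over n keys
    when the target is larger than every key (assumed worst-case miss)."""
    copied = 0
    while n > 0:
        mp = (n - 1) // 2   # midpoint index, computed as in the question's code
        n = n - (mp + 1)    # length of keys[mp+1:], the upper half
        copied += n         # the slice copies every element of that half
    return copied

def index_steps(n):
    """Count iterations of an index-based binary search for the same miss."""
    lo, hi, steps = 0, n - 1, 0
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        lo = mid + 1        # target is larger than keys[mid], so go right
        steps += 1
    return steps

for n in (1_000, 1_000_000):
    print(n, copied_elements(n), index_steps(n))
```

The copy count grows roughly in proportion to n, while the number of index-based steps grows only with log n. Building the dictionary in the question's binary_search is a further pass over all n keys on every call.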
