Return 所有匹配值排序列表的索引

Return index of all matching values sorted list

我最近在面试中遇到了这个问题,一直无法解决。给定一个排序列表:

li = [1,2,2,3,4,4,4,4,5,5,5...]

Return 匹配目标的所有元素的索引,ex. 4,时间复杂度为 O(log n)。

这个问题的设置告诉我这是一个二进制搜索问题。我用类似下面的内容回答了它,但没能想出更好的答案:

data = [2,3,5,6,8,9,12,12,12,14,17,19,22,25,27,28,33,37]
target = 12

def binary_search(data, target):
    low = 0 
    high = len(data) - 1

    while low <= high:
        mid = (high + low) // 2
        if data[mid] == target:
            return mid
        elif data[mid] > target:
            high = mid - 1
        else:
            low = mid + 1
    
    return False

def return_other_indices(data, target, match_index):
    output = [match_index]

    i = match_index + 1
    while data[i] == target:
        output.append(i)
        i += 1

    i = match_index - 1
    while data[i] == target:
        output.append(i)
        i -= 1

    return output

match_index = binary_search(data, target)
output = return_other_indices(data, target, match_index)
print(output)

有更好的方法吗?

我不确定这是否是您要找的,但这里有一个使用 python 标准库的解决方案。

仅使用 bisect

from bisect import bisect_left, bisect
start = bisect_left(data, target)
stop = bisect(data, target, lo=start)
output = range(start, stop)
print(list(output))

输出:[6,7,8]

较早的答案:

正在使用bisect.bisect_left to find the starting point, and itertools.takewhile获取所有元素。

from bisect import bisect_left
from itertools import takewhile, islice
left = bisect_left(data, target)
[left]+[i[0]+left+1 for i in takewhile(lambda x: x[1]==target,
                                       enumerate(islice(data, left+1, len(data))))]

注意。当预期只有几个目标并且数据很大时,带有 takewhile 的线性版本可能会更快,否则双等分通常会更快

首先,您必须考虑二分查找是什么,以及它是如何工作的。现在应用到你要修改寻找第一个和最后一个的问题。

在此基础上,使用二分法搜索第一个出现的地方,然后是最后一个出现的地方。

通过同时使用两个索引,我可以简单打印区间

之间的数字
def findFirstOccurrence(nums, target):
    (left, right) = (0, len(nums) - 1)
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if target == nums[mid]:
            result = mid
            right = mid - 1
        elif target < nums[mid]:
            right = mid - 1
        else:
            left = mid + 1
    return result
    
def findLastOccurrence(nums, target):
    (left, right) = (0, len(nums) - 1)
    result = -1
    while left <= right:
        mid = (left + right) // 2
        if target == nums[mid]:
            result = mid
            left = mid + 1
        elif target < nums[mid]:
            right = mid - 1
        else:
            left = mid + 1
    return result


if __name__ == '__main__':
 
    nums = [2, 5, 5, 5, 5, 5, 6, 6, 8, 9, 9, 9]
    target = 5
    
    indexFirst = findFirstOccurrence(nums, target)
    indexLast = findLastOccurrence(nums, target)
    
    if(indexLast == -1 or indexFirst == -1):
        print("Element not found")
    else:
        print(*range(indexFirst,indexLast+1))

更新

这里是上面的优化形式:

def optimicedOccurrence(nums, target, type_of_occurrence):
    (left, right) = (0, len(nums) - 1)
    result = -1
    while left <= right:
        
        mid = (left + right) // 2
        if target == nums[mid]:
            result = mid
            if(type_of_occurrence):
                right = mid - 1
            else:
                left = mid + 1
        elif target < nums[mid]:
            right = mid - 1
        else:
            left = mid + 1
    return result


if __name__ == '__main__':
 
    nums = [2, 5, 5, 5, 5, 6, 6, 8, 9, 9, 9]
    target = 5
    
    indexFirst = optimicedOccurrence(nums, target, 1)
    indexLast = optimicedOccurrence(nums, target, 0)
    
    if(indexLast == -1 or indexFirst == -1):
        print("Element not found")
    else:
        print(*range(indexFirst,indexLast+1))

使用对分模块:

from bisect import bisect_left,bisect_right

li = [1,2,2,3,4,4,4,4,5,5,5]

target = 4
print(*range(bisect_left(li,target),bisect_right(li,target)))

4 5 6 7