获取每个给定范围之间的整数数量 - 无需外部包

Question

任务：

Write a function that receives 3 lists and returns an array. The first list contains n integers, their values range between 0 and 10^9. "numbers".
The second list is a low-range list, which contains the lower end of a range, it contains q integers. "low".
The third list is a high-range list, which contains the higher end of a range, it contains q integers. "high".

The function should return a list that contains the number of integers in the first list, that fall in its range, given by the low-range and high-range lists.

In the returned list, at index i, there should be the number of integers in "numbers" which are bigger or equal to low[i] and smaller or equal to high[i].
You can only import math, no other imports are allowed
the list may not be sorted

示例：
count_range([12,13,14,15,17],[14],[14]) 应该 return [1]
count_range([12,13,14,15,17],[14,15],[14,18]) 应该 return [1,2]
count_range([12,13,14,15,17],[12],[17]) 应该 return [5]

这是我的解决方案，但效率不够，我需要一些方法来优化它或以不同的方式解决它，而无需导入任何外部包。

def binarySearch(data, val):
    highIndex = len(data) - 1
    lowIndex = 0
    while highIndex > lowIndex:
        index = math.ceil((highIndex + lowIndex) / 2)
        sub = data[index]
        if sub > val:
            if highIndex == index:
                return sorted([highIndex, lowIndex])
            highIndex = index
        else:
            if lowIndex == index:
                return sorted([highIndex, lowIndex])
            lowIndex = index
    return sorted([highIndex, lowIndex])


def count_range(numbers, low, high):
    numbers.sort()
    result = []
    low_range_dict = {}
    high_range_dict = {}
    for i in range(len(numbers)):
        if numbers[i] not in low_range_dict:
            low_range_dict[numbers[i]] = i
        high_range_dict[numbers[i]] = i
    for i in range(len(low)):
        low_r = low[i]
        high_r = high[i]
        if low_r not in low_range_dict:
            low_range_dict[low_r] = binarySearch(numbers, low_r)[0]
            high_range_dict[low_r] = low_range_dict[low_r]
        low_index = low_range_dict.get(low_r)
        if high_r not in high_range_dict:
            high_range_dict[high_r] = binarySearch(numbers, high_r)[0]
            low_range_dict[high_r] = high_range_dict[high_r]
        high_index = high_range_dict.get(high_r)
        if low_r in numbers or low_r < numbers[0]:
            low_index -= 1
        result.append(high_index - low_index)
    return result

Answer 1

如果我们可以使用标准库中的任何模块，我们就可以编写一个非常简单的解决方案。

from bisect import bisect_left
from functools import lru_cache, partial

def count_range(numbers, lows, highs):
    index = lru_cache()(partial(bisect_left, sorted(numbers)))
    return [index(hi + 1) - index(lo) for (lo, hi) in zip(lows, highs)]

但我们可以编写我们自己的（简化的）等同于 partial、lru_cache 和 bisect_left，因此不需要导入。

它比您的原始代码更简单，并且应该运行更快，但我不知道差异有多大。

我们将使用更简单的二分法函数进行二分查找。而且我们不需要针对高范围和低范围的两个不同的记忆字典。

# This bisect is based on the reference implementation in the standard library.
# in cpython this is actually implemented in C, and is faster.
def bisect_left(a, x):
    """Return the index where to insert item x in list a, assuming a is sorted."""
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


def count_range(numbers, lows, highs):
    numbers.sort()
    # instead of both low_range_dict and high_range_dict
    # we only need a single memoization dictionary. 
    # We could also use @functools.cache from the standard library
    memo = {}
    def index(val):
        """Memoized bisect"""
        if not val in memo:
            memo[val] = bisect_left(numbers, val)
        return memo[val]

    return [index(hi + 1) - index(lo) for (lo, hi) in zip(lows, highs)]

获取每个给定范围之间的整数数量 - 无需外部包

Get the number of integers between each given range - without external packages

python

arrays

range

binary-search