Find the nearest/closest value in a sorted List

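I would like to know whether it is possible to find the closest element in a sorted List for an element that is not in the list.

For example, if we have the values [1,3,6,7] and we are looking for the element closest to 4, it should return 3, because 3 is the largest number in the array that is smaller than 4.

I hope this makes sense, as English is not my native language.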

You need Arrays.binarySearch (see the docs).

Returns: index of the search key, if it is contained in the array; otherwise, (-(insertion point) - 1). The insertion point is defined as the point at which the key would be inserted into the array: the index of the first element greater than the key, or a.length if all elements in the array are less than the specified key.
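To make the return-value contract quoted above concrete, here is a minimal sketch that turns the result of Arrays.binarySearch into the index of the largest element less than or equal to the key. The class name FloorLookup and the helper floorIndex are hypothetical names chosen for illustration, and the array is assumed to be sorted in ascending order.

```java
import java.util.Arrays;

class FloorLookup {
    // Hypothetical helper: index of the largest element <= key,
    // or -1 if every element is greater than key.
    // Assumes the array is sorted in ascending order.
    static int floorIndex(int[] a, int key) {
        int result = Arrays.binarySearch(a, key);
        if (result >= 0) {
            return result;                // exact match
        }
        int insertionPoint = -result - 1; // index of the first element greater than key
        return insertionPoint - 1;        // the element just before it, or -1 if none
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 6, 7};
        int idx = floorIndex(a, 4);
        if (idx >= 0) {
            System.out.println(a[idx]);   // prints 3
        }
    }
}
```

For the key 4, binarySearch returns -3 (insertion point 2), so the element before the insertion point, a[1] = 3, is the answer the question asks for.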

The simplest approach seems to be to iterate over the sorted list, checking each item.

List<Integer> ints = new ArrayList<>();
ints.add(1);
ints.add(3);
ints.add(6);
ints.add(7);

Collections.sort(ints);

int target = 4;
int nearest = 0;

for (int i : ints)
{
    if (i <= target) {
        nearest = i;
    }
}

System.out.println(nearest);

This outputs the largest item in the list that is less than or equal to target.
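One caveat with the loop above is that it prints 0 when no item is less than or equal to target, which is indistinguishable from a real 0 in the list. A possible variant, sketched here under the assumption that Java 9+ is available (for List.of), stops as soon as it passes the target and reports "no such element" explicitly. The names LinearFloor and floor are hypothetical.

```java
import java.util.List;
import java.util.OptionalInt;

class LinearFloor {
    // Hypothetical helper: largest element <= target in an ascending list,
    // or an empty OptionalInt if every element is greater than target.
    static OptionalInt floor(List<Integer> sorted, int target) {
        OptionalInt best = OptionalInt.empty();
        for (int v : sorted) {
            if (v > target) {
                break;              // sorted input: nothing later can qualify
            }
            best = OptionalInt.of(v);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(floor(List.of(1, 3, 6, 7), 4)); // prints OptionalInt[3]
        System.out.println(floor(List.of(1, 3, 6, 7), 0)); // prints OptionalInt.empty
    }
}
```

This is still O(n) in the worst case; it only avoids scanning past the target.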

Because the collection is sorted, you can do a modified binary search in O(log n):

    public static int search(int value, int[] a) {

        if(value < a[0]) {
            return a[0];
        }
        if(value > a[a.length-1]) {
            return a[a.length-1];
        }

        int lo = 0;
        int hi = a.length - 1;

        while (lo <= hi) {
            int mid = (hi + lo) / 2;

            if (value < a[mid]) {
                hi = mid - 1;
            } else if (value > a[mid]) {
                lo = mid + 1;
            } else {
                return a[mid];
            }
        }
        // lo == hi + 1
        return (a[lo] - value) < (value - a[hi]) ? a[lo] : a[hi];
    }

Since most of the code above is a binary search, you can use the binarySearch(...) provided by the standard library and examine the value of the insertion point:

    public static int usingBinarySearch(int value, int[] a) {
        if (value <= a[0]) { return a[0]; }
        if (value >= a[a.length - 1]) { return a[a.length - 1]; }

        int result = Arrays.binarySearch(a, value);
        if (result >= 0) { return a[result]; }

        int insertionPoint = -result - 1;
        return (a[insertionPoint] - value) < (value - a[insertionPoint - 1]) ?
                a[insertionPoint] : a[insertionPoint - 1];
    }

Consider using NavigableSet, in particular its higher and lower methods.
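As a rough sketch of that suggestion, the following uses a TreeSet (the standard NavigableSet implementation). The helper nearest and the class name NavigableNearest are hypothetical; the helper uses floor and ceiling rather than lower and higher so that an exact match is returned as-is, and it resolves ties toward the smaller element, which is an assumption about the desired behavior. Note that a set discards duplicate values.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class NavigableNearest {
    // Hypothetical helper: the element closest to target (ties go to the smaller one).
    // Returns null for an empty set.
    static Integer nearest(NavigableSet<Integer> set, int target) {
        Integer below = set.floor(target);    // greatest element <= target, or null
        Integer above = set.ceiling(target);  // least element >= target, or null
        if (below == null) return above;
        if (above == null) return below;
        // long arithmetic avoids int overflow for extreme values
        return ((long) target - below <= (long) above - target) ? below : above;
    }

    public static void main(String[] args) {
        NavigableSet<Integer> set = new TreeSet<>(Arrays.asList(1, 3, 6, 7));
        System.out.println(set.lower(4));     // prints 3 (greatest element strictly less than 4)
        System.out.println(set.higher(4));    // prints 6 (least element strictly greater than 4)
        System.out.println(nearest(set, 4));  // prints 3
    }
}
```

Each of these lookups is O(log n) on a TreeSet, at the cost of building the set first.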

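It just occurred to me that if you need to find all closest values in a sorted list, you can find one closest value and then find all values that have the same distance to the target. Here, I use binary search 3 times:

  • First, to find one closest value
  • Second, to find the leftmost closest value
  • Third, to find the rightmost closest value

In Python: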

def closest_value(arr, target):
  def helper(arr, target, lo, hi, closest_so_far):
    # Edge case
    if lo == hi:
      mid = lo
      if abs(arr[mid] - target) < abs(arr[closest_so_far] - target):
        closest_so_far = mid
      return closest_so_far

    # General case
    mid = ((hi - lo) >> 1) + lo

    if arr[mid] == target:
      return mid

    if abs(arr[mid] - target) < abs(arr[closest_so_far] - target):
      closest_so_far = mid

    if arr[mid] < target:
      # Search right
      return helper(arr, target, min(mid + 1, hi), hi, closest_so_far)
    else:
      # Search left
      return helper(arr, target, lo, max(mid - 1, lo), closest_so_far)


  if len(arr) == 0:
    return -1
  # closest_so_far is an index, so start from index 0 (not the value arr[0])
  return helper(arr, target, 0, len(arr) - 1, 0)


arr = [0, 10, 14, 27, 28, 30, 47]

attempt = closest_value(arr, 26)
print(attempt, arr[attempt])
assert attempt == 3

attempt = closest_value(arr, 29)
print(attempt, arr[attempt])
assert attempt in (4, 5)


def closest_values(arr, target):
  def left_helper(arr, target, abs_diff, lo, hi):
    # Base case
    if lo == hi:
      diff = arr[lo] - target
      if abs(diff) == abs_diff:
        return lo
      else:
        return lo + 1

    # General case
    mid = ((hi - lo) >> 1) + lo
    diff = arr[mid] - target
    if diff < 0 and abs(diff) > abs_diff:
      # Search right
      return left_helper(arr, target, abs_diff, min(mid + 1, hi), hi)
    elif abs(diff) == abs_diff:
      # Search left
      return left_helper(arr, target, abs_diff, lo, max(mid - 1, lo))
    else:
      # Search left
      return left_helper(arr, target, abs_diff, lo, max(mid - 1, lo))


  def right_helper(arr, target, abs_diff, lo, hi):
    # Base case
    if lo == hi:
      diff = arr[lo] - target
      if abs(diff) == abs_diff:
        return lo
      else:
        return lo - 1

    # General case
    mid = ((hi - lo) >> 1) + lo
    diff = arr[mid] - target
    if diff < 0 and abs(diff) > abs_diff:
      # Search right
      return right_helper(arr, target, abs_diff, min(mid + 1, hi), hi)
    elif abs(diff) == abs_diff:
      # Search right
      return right_helper(arr, target, abs_diff, min(mid + 1, hi), hi)
    else:
      # Search left
      return right_helper(arr, target, abs_diff, lo, max(mid - 1, lo))


  a_closest_value = closest_value(arr, target)
  if a_closest_value == -1:
    return -1, -1

  n = len(arr)
  abs_diff = abs(arr[a_closest_value] - target)
  left = left_helper(arr, target, abs_diff, 0, a_closest_value)
  right = right_helper(arr, target, abs_diff, a_closest_value, n - 1)
  return left, right


arr = [0, 10, 14, 27, 27, 29, 30]

attempt = closest_values(arr, 28)
print(attempt, arr[attempt[0] : attempt[1] + 1])
assert attempt == (3, 5)

attempt = closest_values(arr, 27)
print(attempt, arr[attempt[0] : attempt[1] + 1])
assert attempt == (3, 4)

Another easy-to-understand O(log n) solution using binary search:

public class Solution {
    static int findClosest(int arr[], int n, int target)
    {
        int l=0, h=n-1, diff=Integer.MAX_VALUE, val=arr[0];
        while(l<=h)
        {
            int mid=l+(h-l)/2;
            if(Math.abs(target-arr[mid])<diff)
            {
                diff= Math.abs(target-arr[mid]);
                val=arr[mid];
            }
            if(arr[mid]<target)
                l=mid+1;
            else
                h=mid-1;
        }
        return val;

    }

    public static void main(String[] args) {
        System.out.println(findClosest(new int[]{1,3,6,7}, 4, 3));
    }
}

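Andrey's answer is correct. Just to expand on it a little:
there is no need to reinvent the wheel when you can use the built-in binary search.

When the key is not in the list, you can find the neighboring indices as follows: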

int leftIndex = (-Collections.binarySearch(allItems, key) - 2);
int rightIndex = (-Collections.binarySearch(allItems, key) - 1);

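The items in the list need to implement Comparable. Simple types such as String and Integer already implement it. Here is an example: https://www.javatpoint.com/Comparable-interface-in-collection-framework.

Depending on your use case, to be safe you may want to apply index = Math.max(0, index) after the binary search.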
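The two index expressions above only make sense when Collections.binarySearch returns a negative value, that is, when the key is absent. A minimal sketch that also handles an exact match and the two out-of-range cases might look like this; the class NeighborIndices and the helper neighbors are hypothetical names, and the convention of returning -1 or list.size() at the edges is an assumption, so callers still need to bounds-check before indexing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class NeighborIndices {
    // Hypothetical helper: returns {leftIndex, rightIndex} around key in a sorted list.
    // Exact match: both entries are the index of the match.
    // Otherwise leftIndex is -1 if key is below every element,
    // and rightIndex is sorted.size() if key is above every element.
    static <T extends Comparable<? super T>> int[] neighbors(List<T> sorted, T key) {
        int result = Collections.binarySearch(sorted, key);
        if (result >= 0) {
            return new int[] {result, result};
        }
        int insertionPoint = -result - 1;
        return new int[] {insertionPoint - 1, insertionPoint};
    }

    public static void main(String[] args) {
        List<Integer> allItems = Arrays.asList(1, 3, 6, 7);
        System.out.println(Arrays.toString(neighbors(allItems, 4))); // prints [1, 2]
        System.out.println(Arrays.toString(neighbors(allItems, 0))); // prints [-1, 0]
    }
}
```

Calling binarySearch once and deriving both indices from the same result also avoids running the search twice.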

There are two ways to achieve the result:

  1. lower_bound
  2. Binary search

I prefer the lower_bound approach, it is short :)

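// v is a non-empty sorted std::vector<int>; lower_bound returns an iterator, so convert it to an index
int pos = lower_bound(v.begin(), v.end(), value) - v.begin();
if (pos == (int)v.size())
    pos = pos - 1;  // value is greater than every element
else if (pos != 0 && value != v[pos])
    if (abs(v[pos] - value) > abs(value - v[pos - 1]))
        pos = pos - 1;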

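For the binary search, you can refer to the answers above, since my code looks like a copy-paste of theirs. Note that I have declared the array globally here.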

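// Assumes a global sorted array v with n elements and an initial call of
// binarysearch(0, n - 1, val). Returns the index of the closest element.
int binarysearch(int low, int high, int val)
{
    if (val <= v[0])
        return 0;
    if (val >= v[n - 1])
        return n - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2;
        if (v[mid] > val)
            high = mid - 1;
        else if (v[mid] < val)
            low = mid + 1;
        else
            return mid;
    }
    // Here high == low - 1 and v[high] < val < v[low]
    if (abs(val - v[low]) >= abs(val - v[high]))
        return high;
    else
        return low;
}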