Algorithm to find matching real values in a list

Question

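I have a complex algorithm which calculates the result of a function f(x). In the real world f(x) is a continuous function. However, due to rounding errors in the algorithm, this is not the case in the computer program. The following diagram gives an example:

Furthermore, I have a list of several thousand values Fi.

I am looking for all the x values which meet an Fi value, i.e. f(xi)=Fi

I can solve this problem by simply iterating through the x values, as in the following pseudocode: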

for i=0 to NumberOfChecks-1 do
begin

  //calculate the function result with the algorithm
  x=xmin+i*(xmax-xmin)/NumberOfChecks;
  FunctionResult=CalculateFunctionResultWithAlgorithm(x);

  //loop through the value list to see if the function result matches a value in the list
  for j=0 to NumberOfValuesInTheList-1 do
  begin
    if Abs(FunctionResult-ListValues[j])<Epsilon then
    begin
      //mark that element j of the list matches 
      //and store the corresponding x value in the list
    end
  end

end

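Of course a high number of checks has to be used. Otherwise I will miss some x values. The higher the number of checks, the more complete and accurate the result is. It is acceptable if the list is 90% or 95% complete.

The problem is that this brute force approach takes too much time. As I mentioned before, the algorithm for f(x) is quite complex, and with a high number of checks it takes too long.

What would be a better solution for this problem?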

Answer 1

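The best method depends on the nature of your function f(x).

1. The best solution is to construct the inverse of f(x), if you can, and use it.
2. As you said, f(x) is continuous. So you can start by evaluating a small number of far-apart points, find the ranges that make sense, and then refine your "assumption" for the x with f(x)=Fi. It is not bulletproof, but it is an option.
   For example, with Fi=5.7: f(1)=1.4, f(4)=4, f(16)=12.6, f(10)=10.1, f(7)=6.5, f(5)=5.1, f(6)=5.8, so you can take 5 < x < 7.
3. Along the same lines as #1, if f(x) is hard to compute, you can use interpolation and then evaluate f(x) only at the likely values.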
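The sample-then-refine idea could look roughly like the Python sketch below. It assumes the function is cheap enough to evaluate a few dozen extra times per bracket, and the names `find_crossings`, `coarse_steps` and `tol` are made up for illustration. Refinement here is plain bisection on f(x)-Fi, which is one possible choice, not necessarily what the answer had in mind.

```python
from typing import Callable, List

def find_crossings(f: Callable[[float], float], target: float,
                   xmin: float, xmax: float,
                   coarse_steps: int = 16, tol: float = 1e-6) -> List[float]:
    """Return x values in [xmin, xmax] where f(x) is approximately target."""
    # Coarse pass: evaluate f once at each of a few far-apart points.
    xs = [xmin + i * (xmax - xmin) / coarse_steps for i in range(coarse_steps + 1)]
    gs = [f(x) - target for x in xs]
    roots = []
    for k in range(coarse_steps):
        lo, hi = xs[k], xs[k + 1]
        g_lo, g_hi = gs[k], gs[k + 1]
        if g_lo == 0.0:
            roots.append(lo)          # exact hit on a coarse point
            continue
        if g_hi == 0.0 or g_lo * g_hi > 0.0:
            # No sign change here; an exact hit at hi is recorded
            # when hi becomes the next lo (or after the loop).
            continue
        # Refinement: bisect the bracket until it is narrower than tol.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            g_mid = f(mid) - target
            if g_lo * g_mid <= 0.0:
                hi = mid
            else:
                lo, g_lo = mid, g_mid
        roots.append(0.5 * (lo + hi))
    if gs[-1] == 0.0:
        roots.append(xs[-1])
    return roots
```

Two caveats: a bracket is only detected when f(x)-Fi changes sign between neighbouring coarse points, so crossings that come in pairs inside one interval are missed, and with rounding noise in f there is little point in choosing `tol` smaller than the noise.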

Answer 2

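Sort the list, producing an array SortedListValues that contains the sorted ListValues and an array SortedListValueIndices that contains, for each entry in SortedListValues, its index in the original array. You only actually need the second of these, and you can create both with a single sort by sorting an array of (value, index) tuples using value as the sort key.
Iterate over the range 0..NumberOfChecks-1, compute the function value at each step, and then use a binary chop to search for it in the sorted list.

Pseudocode: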

// sort as described above
SortedListValueIndices = sortIndices(ListValues);

for i=0 to NumberOfChecks-1 do
begin

  //calculate the function result with the algorithm
  x=xmin+i*(xmax-xmin)/NumberOfChecks;
  FunctionResult=CalculateFunctionResultWithAlgorithm(x);

  // do a binary chop to find the closest element in the list
  highIndex = NumberOfValuesInTheList-1;
  lowIndex = 0;
  while true do
  begin
     if Abs(FunctionResult-ListValues[SortedListValueIndices[lowIndex]])<Epsilon then
     begin
       // find all elements in the range that match, breaking out
       // of the loop as soon as one doesn't
       for j=lowIndex to NumberOfValuesInTheList-1 do
       begin
         if Abs(FunctionResult-ListValues[SortedListValueIndices[j]])>=Epsilon then    
           break
         //mark that element SortedListValueIndices[j] of the list matches 
         //and store the corresponding x value in the list
       end
       // break out of the binary chop loop
       break
     end

     // break out of the loop once the indices match
     if highIndex <= lowIndex then
        break

     // do the binary chop searching, adjusting the indices:
     middleIndex = (lowIndex + 1 + highIndex) / 2;
     if ListValues[SortedListValueIndices[middleIndex]] < FunctionResult then
         lowIndex = middleIndex;
     else          
     begin
         highIndex = middleIndex;        
         lowIndex = lowIndex + 1;
     end
  end
end

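Possible complications:

The binary chop does not take the epsilon into account. Depending on your data this may or may not be an issue. If it is acceptable that the list is only 90% or 95% complete, this might be OK. If not, you will need to widen the range to take it into account.
I have assumed that you want to be able to match multiple x values for each FunctionResult. If that is not necessary, you can simplify the code.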
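One way to widen the range for the epsilon is to search for both ends of the window [FunctionResult-Epsilon, FunctionResult+Epsilon] instead of for a single value. Here is a minimal Python sketch using the standard `bisect` module; the helper names `build_sorted_index` and `matches_within_epsilon` are invented for this example.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

def build_sorted_index(list_values: List[float]) -> Tuple[List[float], List[int]]:
    """Sort (value, index) tuples once; return sorted values and original indices."""
    pairs = sorted((v, i) for i, v in enumerate(list_values))
    return [v for v, _ in pairs], [i for _, i in pairs]

def matches_within_epsilon(sorted_values: List[float], sorted_indices: List[int],
                           function_result: float, epsilon: float) -> List[int]:
    """Original indices of all list values with |value - function_result| < epsilon."""
    # Two binary searches give the candidate window in the sorted array.
    lo = bisect_left(sorted_values, function_result - epsilon)
    hi = bisect_right(sorted_values, function_result + epsilon)
    # The final filter keeps the strict "< epsilon" test of the pseudocode.
    return [sorted_indices[k] for k in range(lo, hi)
            if abs(sorted_values[k] - function_result) < epsilon]
```

For example, with `ListValues = [5.0, 1.0, 3.0, 3.05, 9.0]` and `Epsilon = 0.1`, a function result of 3.02 matches the original indices 2 and 3. Each lookup costs two binary searches plus the size of the window.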

Answer 3

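Another way to do this is in two parts: generate all of the results, sort them, and then merge with the sorted list of existing values.

The first step is to compute all of the results and save them along with the x value that generated them. That is: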

results = list of <x, result>

for i = 0 to NumberOfChecks
    //calculate the function result with the algorithm
    x=xmin+i*(xmax-xmin)/NumberOfChecks;
    FunctionResult=CalculateFunctionResultWithAlgorithm(x);
    results.Add(x, FunctionResult)
end for

Now, sort the results list by FunctionResult, and also sort the ListValues array by value.

You now have two sorted lists that you can move through linearly:

i = 0, j = 0;
while (i < results.length && j < ListValues.length)
{
    diff = ListValues[j] - results[i].result;
    if (Abs(diff) < Epsilon)
    {
        // mark this one with the x value
        // and move to the next result
        i = i + 1
    }
    else if (diff > 0)
    {
        // list value is much larger than result. Move to next result.
        i = i + 1
    }
    else
    {
        // list value is much smaller than result. Move to next list value.
        j = j + 1
    }
}
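For reference, the whole sort-and-merge approach might look like this in Python. Note that this is a variation rather than a literal translation: instead of advancing one pointer per comparison, it scans the short window of results around each list value, so that several x values can be recorded for the same list entry. The function name `merge_match` and the returned dictionary layout are assumptions made for this sketch.

```python
from typing import Callable, Dict, List

def merge_match(f: Callable[[float], float], list_values: List[float],
                xmin: float, xmax: float,
                number_of_checks: int, epsilon: float) -> Dict[int, List[float]]:
    """Map original list index -> x values whose f(x) lies within epsilon."""
    # Step 1: evaluate f once per x and keep (result, x) pairs sorted by result.
    results = sorted(
        (f(x), x)
        for x in (xmin + i * (xmax - xmin) / number_of_checks
                  for i in range(number_of_checks))
    )
    # Step 2: sort the list values, remembering their original indices.
    targets = sorted((v, j) for j, v in enumerate(list_values))

    matches: Dict[int, List[float]] = {}
    start = 0
    for value, j in targets:
        # Results this small are also too small for every later (larger) target.
        while start < len(results) and results[start][0] <= value - epsilon:
            start += 1
        # Collect every result inside the open window (value-eps, value+eps).
        k = start
        while k < len(results) and results[k][0] < value + epsilon:
            matches.setdefault(j, []).append(results[k][1])
            k += 1
    return matches
```

The expensive function is still evaluated exactly once per x. After the two sorts, the matching pass is linear in the two list lengths plus the number of matches found.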

Answer 4

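Of course this depends very much on the data, especially on the numeric distribution of Fi. Another problem is that f(x) looks very jumpy, which eliminates the concept of "assumption of nearby value".

But one could optimize the search.

Picture below.

Go through f(x) at sufficient granularity and define a rough min (red line) and max (green line), using a suitable tolerance (the "air" or "gap" in between). The area between min and max is "AREA".
See where each Fi value hits AREA, and make a stacked marking ("MARKING") on the X axis accordingly (this can be multiple segments of X).
Where lots of MARKINGs lie on top of each other (a higher sum, shown by the vertical black "sum" arrows), do dense hit tests, which increases the overall chance of getting as many hits as possible. Elsewhere, do sparser tests.
Tighten this scheme (decrease the tolerance) as far as you can.
EDIT: Fi is a bit confusing. Is it an ordered array, or is it in random order (as I assumed)?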
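A rough Python sketch of how the marking step could be turned into a test budget follows. It is only one reading of the scheme above: the X axis is cut into equal segments, the min/max envelope is estimated from a few probes per segment, and the tests are then distributed in proportion to how many Fi values fall inside each segment's envelope. All names (`allocate_tests`, `segments`, `probes_per_segment`, `total_tests`) are hypothetical.

```python
from typing import Callable, List

def allocate_tests(f: Callable[[float], float], list_values: List[float],
                   xmin: float, xmax: float, segments: int,
                   probes_per_segment: int, tolerance: float,
                   total_tests: int) -> List[int]:
    """Number of dense hit tests to spend on each X segment."""
    width = (xmax - xmin) / segments
    weights = []
    for s in range(segments):
        left = xmin + s * width
        # Coarse probes across the segment, end points included
        # (probes_per_segment must be at least 2).
        samples = [f(left + p * width / (probes_per_segment - 1))
                   for p in range(probes_per_segment)]
        # Rough min (red line) and max (green line), widened by the tolerance.
        low = min(samples) - tolerance
        high = max(samples) + tolerance
        # Stacked MARKINGs: how many Fi values hit the AREA over this segment.
        weights.append(sum(1 for v in list_values if low <= v <= high))
    total = sum(weights)
    if total == 0:
        return [0] * segments
    # Spend the budget proportionally; rounding means the sum can be
    # slightly off from total_tests.
    return [round(total_tests * w / total) for w in weights]
```

With a jumpy f(x), the probes may underestimate the true envelope, which is what the tolerance is meant to absorb. Segments that receive zero tests could still be given a small minimum if completeness matters more than speed.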

Answer 5

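Jim Mischel's solution works in O(i+j) (not counting the two sorts) instead of the O(i*j) solution that you currently have. However, there is a (very) minor bug in his code. The correct code would be:

    diff = ListValues[j] - results[i].result; //no abs() here
    if (abs(diff) < Epsilon) //add abs() here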
    {
        // mark this one with the x value
        // and move to the next result
        i = i + 1
    }
