为什么Newton-Raphson方法在计算1.0E21和1.0E23的平方根时不收敛？

Question

我有以下函数可以使用 Newton-Raphson 方法计算数字的平方根。

double get_dx(double y, double x)
{
   return (x*x - y)/(2*x);
}

double my_sqrt(double y, double initial)
{
   double tolerance = 1.0E-6;
   double x = initial;
   double dx;
   int iteration_count = 0;
   while ( iteration_count < 100 &&
           fabs(dx = get_dx(y, x)) > tolerance )
   {
      x -= dx;
      ++iteration_count;
      if ( iteration_count > 90 )
      {
         printf("Iteration number: %d, dx: %lf\n", iteration_count, dx);
      }
   }

   if ( iteration_count < 100 )
   {
      printf("Got the result to converge in %d iterations.\n", iteration_count);
   }
   else
   {
      printf("Could not get the result to converge.\n");
   }

   return x;
}

它们适用于大多数号码。但是，当 y 为 1.0E21 和 1.0E23 时，它们不会收敛。可能还有其他数字，我还不知道，函数不收敛。

我测试的初始值为：

y*0.5 -- 可以开始的东西。
1.0E10 -- 最终结果附近的数字。
sqrt(y)+1.0 -- 显然，这是一个非常接近最终答案的数字。

在所有情况下，1.0E21 和 1.0E23 的函数都无法收敛。我尝试了一个较小的数字 1.0E19 和一个较大的数字 1.0E25。两者都不是问题。

这是一个完整的程序：

#include <stdio.h>
#include <math.h>

double get_dx(double y, double x)
{
   return (x*x - y)/(2*x);
}

double my_sqrt(double y, double initial)
{
   double tolerance = 1.0E-6;
   double x = initial;
   double dx;
   int iteration_count = 0;
   while ( iteration_count < 100 &&
           fabs(dx = get_dx(y, x)) > tolerance )
   {
      x -= dx;
      ++iteration_count;
      if ( iteration_count > 90 )
      {
         printf("Iteration number: %d, dx: %lf\n", iteration_count, dx);
      }
   }

   if ( iteration_count < 100 )
   {
      printf("Got the result to converge in %d iterations.\n", iteration_count);
   }
   else
   {
      printf("Could not get the result to converge.\n");
   }

   return x;
}

void test(double y)
{
   double ans = my_sqrt(y, 0.5*y);
   printf("sqrt of %lg: %lg\n\n", y, ans);

   ans = my_sqrt(y, 1.0E10);
   printf("sqrt of %lg: %lg\n\n", y, ans);

   ans = my_sqrt(y, sqrt(y) + 1.0);
   printf("sqrt of %lg: %lg\n\n", y, ans);
}

int main()
{
   test(1.0E21);
   test(1.0E23);
   test(1.0E19);
   test(1.0E25);
}

这是输出（基于 64 位 Linux，使用 gcc 4.8.4）：

Iteration number: 91, dx: -0.000002
Iteration number: 92, dx: 0.000002
Iteration number: 93, dx: -0.000002
Iteration number: 94, dx: 0.000002
Iteration number: 95, dx: -0.000002
Iteration number: 96, dx: 0.000002
Iteration number: 97, dx: -0.000002
Iteration number: 98, dx: 0.000002
Iteration number: 99, dx: -0.000002
Iteration number: 100, dx: 0.000002
Could not get the result to converge.
sqrt of 1e+21: 3.16228e+10

Iteration number: 91, dx: 0.000002
Iteration number: 92, dx: -0.000002
Iteration number: 93, dx: 0.000002
Iteration number: 94, dx: -0.000002
Iteration number: 95, dx: 0.000002
Iteration number: 96, dx: -0.000002
Iteration number: 97, dx: 0.000002
Iteration number: 98, dx: -0.000002
Iteration number: 99, dx: 0.000002
Iteration number: 100, dx: -0.000002
Could not get the result to converge.
sqrt of 1e+21: 3.16228e+10

Iteration number: 91, dx: 0.000002
Iteration number: 92, dx: -0.000002
Iteration number: 93, dx: 0.000002
Iteration number: 94, dx: -0.000002
Iteration number: 95, dx: 0.000002
Iteration number: 96, dx: -0.000002
Iteration number: 97, dx: 0.000002
Iteration number: 98, dx: -0.000002
Iteration number: 99, dx: 0.000002
Iteration number: 100, dx: -0.000002
Could not get the result to converge.
sqrt of 1e+21: 3.16228e+10

Iteration number: 91, dx: 0.000027
Iteration number: 92, dx: 0.000027
Iteration number: 93, dx: 0.000027
Iteration number: 94, dx: 0.000027
Iteration number: 95, dx: 0.000027
Iteration number: 96, dx: 0.000027
Iteration number: 97, dx: 0.000027
Iteration number: 98, dx: 0.000027
Iteration number: 99, dx: 0.000027
Iteration number: 100, dx: 0.000027
Could not get the result to converge.
sqrt of 1e+23: 3.16228e+11

Iteration number: 91, dx: -0.000027
Iteration number: 92, dx: -0.000027
Iteration number: 93, dx: -0.000027
Iteration number: 94, dx: -0.000027
Iteration number: 95, dx: -0.000027
Iteration number: 96, dx: -0.000027
Iteration number: 97, dx: -0.000027
Iteration number: 98, dx: -0.000027
Iteration number: 99, dx: -0.000027
Iteration number: 100, dx: -0.000027
Could not get the result to converge.
sqrt of 1e+23: 3.16228e+11

Iteration number: 91, dx: 0.000027
Iteration number: 92, dx: 0.000027
Iteration number: 93, dx: 0.000027
Iteration number: 94, dx: 0.000027
Iteration number: 95, dx: 0.000027
Iteration number: 96, dx: 0.000027
Iteration number: 97, dx: 0.000027
Iteration number: 98, dx: 0.000027
Iteration number: 99, dx: 0.000027
Iteration number: 100, dx: 0.000027
Could not get the result to converge.
sqrt of 1e+23: 3.16228e+11

Got the result to converge in 35 iterations.
sqrt of 1e+19: 3.16228e+09

Got the result to converge in 6 iterations.
sqrt of 1e+19: 3.16228e+09

Got the result to converge in 1 iterations.
sqrt of 1e+19: 3.16228e+09

Got the result to converge in 45 iterations.
sqrt of 1e+25: 3.16228e+12

Got the result to converge in 13 iterations.
sqrt of 1e+25: 3.16228e+12

Got the result to converge in 1 iterations.
sqrt of 1e+25: 3.16228e+12

谁能解释为什么上述函数对 1.0E21 和 1.0E23 不收敛？

Answer 1

这是一个浮点数精度问题，结合使用固定的tolerance。

问题是，如果被减去的数字足够大，您可以获得以下两种效果之一：(1) "false" 零差，这不是问题，或者(2) 由于差异持续大于公差而导致收敛失败。

使用 1e-6 固定容差的另一个问题是它不适用于小数字。例如，假设您要计算 1e-16 的平方根。平方根将为 1e-8。收敛测试将错误地决定它在第一次迭代时就找到了解决方案。在这种情况下，需要更小的公差。

更复杂的收敛测试可能会有意义。例如，检查当前估计值是否比先前估计值更接近。

Answer 2

这个答案具体解释了为什么算法无法收敛输入 1e23。

您面临的问题称为 "small difference of large numbers"。具体来说，您正在计算 (x*x - y)/(2*x)，其中 y 是 1e23，而 x 大约是 3.16e11。

1e23 的 IEEE-754 encoding 是 0x44b52d02c7e14af6。因此指数为 0x44b，十进制为 1099。这必须减少 1023，因为指数被编码为偏移二进制。然后你必须再减去52才能找到一个LSB的权重。

0x44b ==>    1099 - 1023 - 52 = 24    ==> 1 LSB = 2^24

因此单个 LSB 的值为 16777216。

这段代码的结果

double y = 1e23;
double x = sqrt(y);
double dx = x*x - y;
printf( "%lf\n", dx );

实际上是-16777216。

因此，当您计算 x*x - y 时，结果要么为零，要么为 16777216 的倍数。将 16777216 除以 2*x 即 2*3.16e11给出0.000027的误差，不在你的容忍范围内。

最接近 sqrt(y) 的两个值是

0x4252682931c035a0
0x4252682931c035a1

当你对这些数字进行平方时，你会得到

0x44b52d02c7e14af5
0x44b52d02c7e14af7

所以没有一个符合期望的结果，即

0x44b52d02c7e14af6

因此算法永远无法收敛。

算法对 1e25 有效的原因是运气。 1e25 的编码是

0x45208b2a2c280291

sqrt(1e25) 的编码是

0x428702337e304309

当你对这个数字求平方时，你会得到

0x45208b2a2c280291

因此 x*x - y 正好是 0，算法很幸运。

为什么Newton-Raphson方法在计算1.0E21和1.0E23的平方根时不收敛？

Why won't the Newton-Raphson method converge when computing the square roots of 1.0E21 and 1.0E23?

c

algorithm

newtons-method