C fmod 函数:浮点错误和优化

C fmod function: Floating point error and optimization

我正在尝试以尽可能少的 CPU 周期计算出地球表面从一点到另一点的真实航向。结果应该是 double 0 <= tc < 360,但是在一些特殊情况下,我得到的结果是 360(应该报告为 0)。我意识到这是由于使用 fmod 和浮点数时的机器精度,但最有效的解决方法是什么?

#include <stdio.h>
#include <math.h>

#define EPS 1e-15 // EPS a small number ~ machine precision
#define R2D 57.295779513082320876798154814105   //multiply radian with R2D to get degrees
#define D2R 0.01745329251994329576923690768489  //multiply degrees with D2R to get radians
#define TWO_PI 6.283185307179586476925286766559 //2*Pi

/*----------------------------------------------------------------------------
* Course between points

* We obtain the initial course, tc1, (at point 1) from point 1 to point 2
* by the following. The formula fails if the initial point is a pole. We can
* special case this with as IF statement
----------------------------------------------------------------------------
  Implementation
  Argument 1: INPUT - Pointer to double containing Latitude  of point 1 in degrees
  Argument 2: INPUT - Pointer to double containing Longitude of point 1 in degrees
  Argument 3: INPUT - Pointer to double containing Latitude  of point 2 in degrees
  Argument 4: INPUT - Pointer to double containing Longitude of point 2 in degrees

  RETURN: Double containing initial course in degrees from point1 to point 2
--------------------------------------------------------------------------*/
double _stdcall CourseInitial (double *lat1, double *lon1, double *lat2, double *lon2)
{
    double radLat1 = D2R * *lat1;
    double radLon1 = D2R * *lon1;
    double radLat2 = D2R * *lat2;
    double radLon2 = D2R * *lon2;
    double tc = 0;

    if (cos(radLat1) < EPS) {     // EPS a small number ~ machine precision
      if (radLat1 > 0) {
          tc = 180;  //  starting from N pole
      } else {
          tc = 0;    //  starting from S pole
      }
    } else {
      // Calculate true course [180, 540]
      tc = R2D * atan2(sin(radLon2-radLon1),
                       cos(radLat1) * tan(radLat2) - sin(radLat1) * cos(radLon2-radLon1)
                      ) + 360;
    }

    //Return 0 <= true course < 360
    return fmod(tc, 360);
}

int main(void)
{
    double lat1 = 89;
    double lon1 = 17;
    double lat2 = 68;
    double lon2 = -163;
    double tc = 0;
    tc = CourseInitial(&lat1, &lon1, &lat2, &lon2);
    printf("The course from point 1 to 2 is: %.5f", tc);

    return 0;

}

输出:

The course from point 1 to 2 is: 360.00000

问题所在

两种不同级别的优化结果出现差异 在计算 radLon2-radLon1 时。此计算的结果显示在此处。

-O0 radLon2-radLon1 = 0xC00921FB54442D19
-O1 radLon2-radLon1 = 0xC00921FB54442D18

不同之处在于最低有效位使 -O0 结果超过了 pi 的实际值。

-O0 radLon2-radLon1 = -3.14159265358979356008717331861
Pi with 50 decimals =  3.14159265358979323846264338327950288419716939937510
-O1 radLon2-radLon1 = -3,14159265358979311599796346854

-O0 计算在浮点堆栈上加载一个值并减去另一个值 FSUB 命令(第 004014a8 行:)。 -O1 计算将两个值加载到浮点堆栈 并用 FSUBP 命令减去它们(第 00401480 行:)

无优化编译后反汇编-O0

(...)
004014a5:   fld QWORD PTR [ebp-0x30]     // ST(0) = radLon2         (from [ebp-0x30])
004014a8:   fsub QWORD PTR [ebp-0x20]    // ST(0) = ST(0) - radLon1 (from [ebp-0x20])
004014ab:   fstp QWORD PTR [esp]         // [esp] = (radLon2-radLon1) = 0xC00921FB54442D19
004014ae:   call 0x4080d0 <sin>          // ST(0) = sin(radLon2-radLon1)
(...)
-------------------------------------------------------------------------------------
Significant values:

radLon2 =              0xC006C253F2B53437  (-2.84488668075075734620327239099     )     
radLon1 =              0x3FD2FD3B0C77C70D  ( 0.296705972839036047350447233839    )    
radLon2-radLon1 =      0xC00921FB54442D19  (-3.14159265358979356008717331861     ) 
sin(radLon2-radLon1) = 0x3CB72D0000000000  ( 3.21628574467824890348310873378e-16 )                       

Later atan2(y, x) is calculated with these values
x =           0x3FF0B04ED1755590  ( 1.04304391688978981278523860965     )
y =           0x3CB72D0000000000  ( 3.21628574467824890348310873378e-16 )
atan2(y, x) = 0x3CB63828CAA39659  ( 3.08355735803412799607014393888e-16 ) 

编译后反汇编优化-O1

(...)
29            double radLon2 = D2R * *lon2;
0040146c:   fld QWORD PTR ds:0x40a0c0                   // ST(0) = D2R          (from [ds:0x40a0c0]) 
00401472:   fmul QWORD PTR [esp+0x30]                   // ST(0) = ST(0) * lon2 (from [ESP+0x30])
                                                        // ST(0) = -2.8448866807507573
27            double radLon1 = D2R * *lon1;
00401476:   fld QWORD PTR ds:0x40a0c0                   // ST(0) = D2R          (from [ds:0x40a0c0]) 
0040147c:   fmul QWORD PTR [esp+0x20]                   // ST(0) = ST(0) * lon1 (from [ESP+0x20])
                                                        // ST(0) = 0.29670597283903605 (radLon1)
                                                        // ST(1) = -2.8448866807507573 (radLon2)
00401480:   fsubp st(1),st                              // ST(1) = ST(1)-ST(0) then POP stack 
00401482:   fst QWORD PTR [esp+0x20]                    // [esp+0x20] = (radLon2-radLon1) = 0xC00921FB54442D18 
(...)

40              tc = R2D * atan2(sin(radLon2-radLon1),
00401492:   fld QWORD PTR [esp+0x20]                    // ST(0)=(radLon2-radLon1)
00401496:   fstp QWORD PTR [esp]                        // [esp]=(radLon2-radLon1)
00401499:   call 0x4080e0 <sin>                         // ST(0)=sin(radLon2-radLon1)
(...)
-------------------------------------------------------------------------------------
Significant values

radLon2 =              0xC006C253F2B53437  (-2.84488668075075734620327239099     )     
radLon1 =              0x3FD2FD3B0C77C70D  ( 0.296705972839036047350447233839    )    
radLon2-radLon1 =      0xC00921FB54442D18  (-3,14159265358979311599796346854     ) 
sin(radLon2-radLon1) = 0xBCA1A60000000000  (-1.22460635382237725821141793858e-16 )                       

Later atan2(y, x) is calculated with these values
x =           0x3FF0B04ED1755590  ( 1.04304391688978981278523860965     )
y =           0xBCA1A60000000000  (-1.22460635382237725821141793858e-16 )
atan2(y, x) = 0xBCA0EB8D90F27437  (-1.1740697913027295863036855646E-16 ) 
=====================================================================================

尝试的解决方案

在 CourseInitial() 中,radLon1 和 radLon2 不是独立使用的。所以我尝试了以下方法。

double radDeltaLon = D2R * (*lon2-*lon1);
(...)
tc = R2D * atan2(sin(radDeltaLon),
                 cos(radLat1) * tan(radLat2) - sin(radLat1) * cos(radDeltaLon)
                ) + 360;

这没有用。调试显示接近 Pi 的问题值出现了 代码中的另一个地方,最终结果是一样的。

一个解决方案

在每个定义的常量的末尾,我添加了一个 L 并将它们转换为 Long Doubles(80 位浮点数)。这与 CPU 在其浮点寄存器中的精度相同,并在某些情况下解决了该问题。

#define R2D 57.295779513082320876798154814105L   //multiply radian with R2D to get degrees
#define D2R 0.01745329251994329576923690768489L  //multiply degrees with D2R to get radians
#define TWO_PI 6.283185307179586476925286766559L //2*Pi

最终解决方案

// Calculate true course [-180, 180)
tc = atan2(sin(radDeltaLon),
           cos(radLat1) * tan(radLat2) - sin(radLat1) * cos(radDeltaLon)
          );

if (fabs(tc) < EPS) {
    tc = 0;  //Prevents fmod(tc, 360) from returning 360 due to rounding error
} else {
    tc *= R2D; //Convert to degrees after tc has been checked for machine precision
    tc += 360; //tc [180, 540)
}
return fmod(tc, 360); // returns tc [0, 360)

常数D2R的给定值与最接近的64位和80位浮点数的比较

80 bit 0x3FF98EFA351294E9C8AF = 1.745329251994329577083321213687439055206596094649285078048706054e-2
D2R                           = 1.745329251994329576923690768489e-2
80 bit 0x3FF98EFA351294E9C8AE = 1.74532925199432957691391462423657898739293159451335668563842773e-2

64 bit 0x3F91DF46A2529D3A     = 1.745329251994329894381863255148346070200204849243164e-2
D2R                           = 1.745329251994329576923690768489e-2
64 bit 0x3F91DF46A2529D39     = 1.7453292519943295474371680597869271878153085708618164e-2

这些是我的编译器选择的值:

0x3FF98EFA351294E9C8AE = 0 011111111111001 1000111011111010001101010001001010010100111010011100100010101110
0x3F91DF46A2529D39     = 0     01111111001  0001110111110100011010100010010100101001110100111001

转换是使用下列网页上的工具和信息执行的:
http://www.exploringbinary.com/binary-converter
http://apfloat.appspot.com
http://en.wikipedia.org/wiki/Extended_precision
http://en.wikipedia.org/wiki/Double-precision_floating-point_format