C - adding two single-precision floating-point normal numbers never yields infinity

I'm playing around with floating-point arithmetic, and I've run into something that needs explaining.

When I set the rounding mode to 'towards zero', i.e.:

fesetround(FE_TOWARDZERO);

and add various kinds of positive normal numbers, I can never reach infinity.

However, according to IEEE 754, adding finite numbers can overflow to infinity.

For example:

#include <fenv.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Reinterpret a 32-bit pattern as a float; memcpy avoids the aliasing problem of a pointer cast
float hex2float (uint32_t hex_num) {
  float f;
  memcpy(&f, &hex_num, sizeof f);
  return f;
}

int main(void) {
  uint32_t a_int = 0x7f7fffff; // Maximum finite single precision number, about 3.4E38
  uint32_t b_int = 0x7f7fffff;
  float a = hex2float(a_int);
  float b = hex2float(b_int);
  float res_add;

  fesetround(FE_TOWARDZERO);  // need to include fenv.h for that
  printf("Calculating... %+e + %+e\n",a,b);
  res_add = a + b;
  printf("Res = %+e\n",res_add);
  return 0;
}

However, if I change the rounding mode to one of the other modes, I may get +INF as the answer.

Can someone explain this?

The explanation for the observed behavior is that it is mandated by the IEEE 754-2008 floating-point standard:

7.4 Overflow

The overflow exception shall be signaled if and only if the destination format’s largest finite number is exceeded in magnitude by what would have been the rounded floating-point result (see 4) were the exponent range unbounded. The default result shall be determined by the rounding-direction attribute and the sign of the intermediate result as follows:

[...]

b) roundTowardZero carries all overflows to the format’s largest finite number with the sign of the intermediate result.

So for the rounding mode used here (truncation, or round toward zero), the result on overflow is the largest finite number, not infinity.