When does underflow occur?

Question

I have run into a situation where computing 1.77e-308/10 triggers an underflow exception, but computing 1.777e-308/10 does not. This is strange because:

Underflow occurs when the true result of a floating point operation is smaller in magnitude (that is, closer to zero) than the smallest value representable as a normal floating point number in the target datatype (from Arithmetic Underflow, Wikipedia)

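In other words, if we compute x/y where x and y are both double, underflow should occur whenever 0 < |x/y| < 2.2251e-308 (the smallest positive normalized double is 2.2251e-308). In theory, therefore, both 1.77e-308/10 and 1.777e-308/10 should trigger an underflow exception. That theory contradicts what I observed when testing with the C program below.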

#include <stdio.h>
#include <fenv.h>
#include <math.h>


int main(){
  double x,y;

  // x = 1.77e-308 => underflow
  // x = 1.777e-308 gives  ==> no underflow
  x=1.77e-308;

  feclearexcept(FE_ALL_EXCEPT);
  y=x/10.0;
  if (fetestexcept(FE_UNDERFLOW)) {
    puts("Underflow\n");
  }
  else puts("No underflow\n");
}

To compile the program I used gcc program.c -lm; I also tried Clang, which gave the same result. Is there an explanation?

[Edit] I have shared the code above via this online IDE.

Answer 1

Underflow is not only a question of range, but also of precision/rounding.

7.12.1 Treatment of error conditions
The result underflows if the magnitude of the mathematical result is so small that the mathematical result cannot be represented, without extraordinary roundoff error, in an object of the specified type. C11 §7.12.1 6

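1.777e-308, converted to the nearest binary64 value 0x1.98e566222bcfcp-1023, happens to have a significand (0x198E566222BCFC, 7193376082541820) that is a multiple of 10. So the division by 10 is exact. There is no rounding error.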

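I find it easier to demonstrate this using hexadecimal notation. Note that in the demonstration, division by 2 is exact in every case except for the smallest value, DBL_TRUE_MIN.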

#include <float.h>
#include <stdio.h>
#include <fenv.h>
#include <math.h>

int uf_test(double x, double denominator){
  printf("%.17e %24a ", x, x);
  feclearexcept(FE_ALL_EXCEPT);
  double y=x/denominator;
  int uf = !!fetestexcept(FE_UNDERFLOW);
  printf("%-24a %s\n", y, uf ? "Underflow" : "");
  return uf;
}

int main(void) {
  uf_test(DBL_MIN, 2.0);
  uf_test(1.777e-308, 2.0);
  uf_test(1.77e-308, 2.0);
  uf_test(DBL_TRUE_MIN, 2.0);

  uf_test(pow(2.0, -1000), 10.0);
  uf_test(DBL_MIN, 10.0);
  uf_test(1.777e-308, 10.0);
  uf_test(1.77e-308, 10.0);
  uf_test(DBL_TRUE_MIN, 10.0);
  return 0;
}

Output

2.22507385850720138e-308                0x1p-1022 0x1p-1023                
1.77700000000000015e-308  0x1.98e566222bcfcp-1023 0x1.98e566222bcfcp-1024  
1.77000000000000003e-308  0x1.97490d21e478cp-1023 0x1.97490d21e478cp-1024  
4.94065645841246544e-324                0x1p-1074 0x0p+0                   Underflow

// No underflow as inexact result is not too small
9.33263618503218879e-302                0x1p-1000 0x1.999999999999ap-1004  
// Underflow as result is too small and inexact
2.22507385850720138e-308                0x1p-1022 0x1.99999999999ap-1026   Underflow
// No underflow as result is exact
1.77700000000000015e-308  0x1.98e566222bcfcp-1023 0x1.471deb4e8973p-1026   
1.77000000000000003e-308  0x1.97490d21e478cp-1023 0x1.45d40a818394p-1026   Underflow
4.94065645841246544e-324                0x1p-1074 0x0p+0                   Underflow

Answer 2

Looking at the documentation for the function you are calling leads to this definition:

FE_UNDERFLOW the result of an earlier floating-point operation was subnormal with a loss of precision

http://en.cppreference.com/w/c/numeric/fenv/FE_exceptions

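I think you have already verified that your numbers are subnormal. The test also involves loss of precision. If you print more significant figures, you will find that the number reporting underflow does appear to lose precision at about 16 decimal places. I'm not sure how many significant figures to expect from a subnormal number, but I think this must be your answer.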

