Understanding the heisenbug example: different precision in registers vs. main memory

Question

I read the wiki page about heisenbugs, but I don't understand this example. Can someone explain it in detail?

One common example of a heisenbug is a bug that appears when the program is compiled with an optimizing compiler, but not when the same program is compiled without optimization (as is often done for the purpose of examining it with a debugger). While debugging, values that an optimized program would normally keep in registers are often pushed to main memory. This may affect, for instance, the result of floating-point comparisons, since the value in memory may have smaller range and accuracy than the value in the register.

Answer 1

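The idea is that code is compiled in one of two modes: a normal or debug mode, and an optimized or production mode.

Just as it is important to know what is happening at the quantum level, we should also know what is happening to our code at the compiler level!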

Answer 2

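Here is a concrete example that was posted recently:

It is a very nice specimen because we can all reproduce it: http://ideone.com/rjY5kQ

These bugs typically depend so heavily on very precise features of the platform that people also find them hard to reproduce.

In this case, when the 'print-out' is omitted, the program performs a high-precision comparison inside the CPU registers (at a higher precision than a double can store). But in order to print the value, the compiler decides to move the result to main memory, which implicitly truncates its precision. When the program then uses that truncated value for the comparison, the comparison succeeds.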

#include <iostream>
#include <cmath>
  
double up = 19.0 + (61.0/125.0);
double down = -32.0 - (2.0/3.0);
double rectangle = (up - down) * 8.0;
 
double f(double x) {
    return (pow(x, 4.0)/500.0) - (pow(x, 2.0)/200.0) - 0.012;
}
 
double g(double x) {
    return -(pow(x, 3.0)/30.0) + (x/20.0) + (1.0/6.0);
}
 
double area_upper(double x, double step) {
    return (((up - f(x)) + (up - f(x + step))) * step) / 2.0;
}
 
double area_lower(double x, double step) {
    return (((g(x) - down) + (g(x + step) - down)) * step) / 2.0;
}
 
double area(double x, double step) {
    return area_upper(x, step) + area_lower(x, step);
}
 
int main() {
    double current = 0, last = 0, step = 1.0;
 
    do {
        last = current;
        step /= 10.0;
        current = 0;
 
        for(double x = 2.0; x < 10.0; x += step) current += area(x, step);
 
        current = rectangle - current;
        current = round(current * 1000.0) / 1000.0;
        //std::cout << current << std::endl; //<-- COMMENT BACK IN TO "FIX" BUG
    } while(current != last);
 
    std::cout << current << std::endl;
    return 0;
}

Edit: verified that the bug and the fix still reproduce: 03-FEB-22, 20-Feb-17

Answer 3

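It comes from the Uncertainty Principle, which basically states that there is a fundamental limit to the precision with which certain pairs of physical properties of a particle can be known simultaneously. If you start observing a particle too closely (i.e., you know its position precisely), then you cannot measure its momentum precisely. (And if you know its exact speed, then you cannot tell its exact position.)

So, following from this, a Heisenbug is a bug that disappears when you look at it closely.

In your example, if you need the program to perform well, you compile it with optimization, and the bug appears. But as soon as you go into debug mode, you compile it without optimization, which makes the bug go away.

So if you start observing the bug too closely, you become uncertain about its nature (or unable to find it at all), which resembles Heisenberg's Uncertainty Principle; hence the name Heisenbug.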
