Signed integer value overflow in C++?

Question

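I have a legacy codebase which we are trying to migrate from devtoolset-4 to devtoolset-7. I noticed an interesting behaviour regarding overflow of signed integers (int64_t, to be specific).

There is a code snippet which is used to detect integer overflow while multiplying a big set of integers: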

// a and b are int64_t
int64_t product = a * b; 
if (b != 0 && product / b != a) {
    // Overflow
}

This code worked fine with devtoolset-4. With devtoolset-7, however, the overflow is never detected.

For example: when a = 83802282034166 and b = 98765432, product becomes -5819501405344925872 (clearly the value has overflowed).

However, product / b evaluates to a (83802282034166), so the if condition never becomes true. Its value should have been computed from the overflowed (negative) product: -5819501405344925872 / 98765432 = -58922451788
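The two numbers quoted above can be reproduced without relying on signed overflow at all. The following is a minimal sketch, assuming a two's complement int64_t (which the fixed-width types guarantee); wrapping_mul is a hypothetical helper name introduced only for illustration. It multiplies in unsigned arithmetic, where wraparound is well-defined, and then reinterprets the bits as signed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: multiply with wraparound semantics without ever
// performing an overflowing signed multiplication.
std::int64_t wrapping_mul(std::int64_t a, std::int64_t b) {
    const std::uint64_t ua = static_cast<std::uint64_t>(a);
    const std::uint64_t ub = static_cast<std::uint64_t>(b);
    const std::uint64_t up = ua * ub;  // well-defined: wraps modulo 2^64
    std::int64_t result;
    std::memcpy(&result, &up, sizeof result);  // reinterpret the same bits as signed
    return result;
}
```

With the values from the question, wrapping_mul(83802282034166, 98765432) yields -5819501405344925872, and dividing that by 98765432 gives -58922451788, which is what the original check expected to see.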

Ironically, the math is correct, but it produces anomalous behaviour compared with devtoolset-4.

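Could the compiler be caching the value (rather than re-evaluating it), resulting in this behaviour?
Or does a compiler optimization convert the statement product / b != a into product != a * b and arrive at the same overflowed value (or perhaps skip the computation altogether, based on the statement product = a * b above)?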

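I understand that signed integer overflow is an 'undefined behaviour' in C++ and so the compiler behaviour could change across implementations. But could someone help me make sense of the above behaviour?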

Note: the g++ versions in devtoolset-4 and devtoolset-7 are g++ (GCC) 5.2 and g++ (GCC) 7.2.1, respectively.

Answer 1

could someone help me make sense of the above behaviour?

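Signed integer overflow has undefined behaviour in C++. This means that you cannot reliably detect it after the fact, and that code containing a signed integer overflow can do anything.

If you want to detect whether an operation would cause signed integer overflow, you need to do the detection before the overflow occurs, so that the UB never happens.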
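One way to get such a check is sketched below. It is an assumption that a compiler extension is acceptable here: __builtin_mul_overflow is a GCC and Clang builtin (available from GCC 5 onwards), not standard C++, and checked_mul_builtin is a hypothetical helper name. The builtin computes the product as if with unlimited precision and reports whether it fits, so no overflowing signed multiplication is ever executed.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: returns a * b, or throws if the result does not fit.
// Relies on the GCC/Clang builtin __builtin_mul_overflow (not standard C++).
std::int64_t checked_mul_builtin(std::int64_t a, std::int64_t b) {
    std::int64_t product = 0;
    if (__builtin_mul_overflow(a, b, &product)) {
        throw std::overflow_error("int64_t multiplication overflowed");
    }
    return product;
}
```

Because the overflow is reported through the builtin's return value instead of being inferred from the product, the optimizer has nothing it can legally fold away.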

Answer 2

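Signed integer overflow is undefined behavior. This is different from unsigned int (and all other unsigned integer types), whose arithmetic is defined to wrap around. More information on this here

As a side note, people have observed that using int instead of unsigned int can improve performance (see here), because the compiler does not have to handle overflow behavior.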
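The contrast with unsigned types can be made concrete. In the sketch below (unsigned_mul_overflows is a hypothetical name), the same multiply-then-divide test from the question is applied to std::uint64_t. Since unsigned arithmetic wraps modulo 2^64 by definition, checking after the multiplication is legitimate and the compiler may not remove it.

```cpp
#include <cstdint>

// Hypothetical helper: reports whether a * b would exceed 64 unsigned bits.
bool unsigned_mul_overflows(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t product = a * b;  // wraps on overflow, no undefined behaviour
    return b != 0 && product / b != a;    // same test as in the question
}
```

This only helps when the inputs are genuinely non-negative; it is not a drop-in replacement for a signed check.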

Answer 3

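Because signed overflow/underflow is classified as undefined behaviour, compilers are allowed to cheat and assume it cannot happen (this came up in a CppCon talk a year or two ago, but I forget which talk off the top of my head). Because you do the arithmetic first and check the result afterwards, the optimizer gets to optimize away part of the check.

This is untested code, but you probably want something like the following: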

if(b != 0) {
    auto max_a = std::numeric_limits<int64_t>::max() / b;
    if(max_a < a) {
        throw std::runtime_error{"overflow"};
    }
}
return a * b;

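Note that this code does not handle underflow; if a * b can be negative, this check will not work.

On Godbolt, you can see that your version has the check optimized out entirely.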
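For completeness, a division-based pre-check that also covers negative operands might look like the following. This is a sketch, not part of the answer above: it follows the usual four-way case split on the signs of the operands, and checked_mul_division is a hypothetical name. Every division in it has a non-zero divisor and never evaluates min / -1, so the check itself cannot overflow.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: returns a * b, or throws if the product would not fit.
std::int64_t checked_mul_division(std::int64_t a, std::int64_t b) {
    constexpr std::int64_t kMax = std::numeric_limits<std::int64_t>::max();
    constexpr std::int64_t kMin = std::numeric_limits<std::int64_t>::min();

    bool overflow = false;
    if (a > 0) {
        // b > 0: product is positive; b <= 0: product is zero or negative.
        overflow = (b > 0) ? (a > kMax / b) : (b < kMin / a);
    } else {
        // a <= 0. b > 0: product is zero or negative; b <= 0: zero or positive.
        overflow = (b > 0) ? (a < kMin / b) : (a != 0 && b < kMax / a);
    }
    if (overflow) {
        throw std::overflow_error("int64_t multiplication would overflow");
    }
    return a * b;  // safe: the product is known to fit
}
```

The multiplication is only reached once the product is known to be representable, so the optimizer cannot use undefined behaviour to discard the test.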

Answer 4

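Signed integer overflow is undefined behaviour in C++.

This means the optimizer can assume it never happens. a*b/b is a, period.

Modern compilers do static single assignment (SSA) based optimization.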

// a and b are int64_t
int64_t product = a * b;
if (b != 0 && product / b != a) {
  // Overflow
}

becomes:

const int64_t __X__ = a * b;
const bool __Y__ = b != 0;
const int64_t __Z__ = __X__ / b;   // as written
const int64_t __Z__ = a*b / b;     // substitute __X__
const int64_t __Z__ = a;           // a*b/b folds to a, since overflow is assumed impossible

if (__Y__ && __Z__ != a) {
  // Overflow
}

which evaluates to

if (__Y__ && false) {
  // Overflow
}

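obviously, because __Z__ is a and a!=a is false.

// widen an operand first, otherwise the multiplication still happens in int64_t;
// __int128 is a GCC/Clang extension
__int128 big_product = static_cast<__int128>(a) * b;

Work with big_product and detect the overflow there.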
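A slightly fuller sketch of the wide-product approach is shown below. It assumes the __int128 extension provided by GCC and Clang on 64-bit targets (standard C++ has no 128-bit integer type), and checked_mul_wide is a hypothetical name. Note that one operand has to be widened before multiplying; otherwise the multiplication is still carried out in int64_t and overflows as before.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: multiplies in 128 bits, then range-checks the result.
// Uses the GCC/Clang __int128 extension (not standard C++).
std::int64_t checked_mul_wide(std::int64_t a, std::int64_t b) {
    // Widen first so the multiplication itself is done in 128 bits.
    const __int128 wide = static_cast<__int128>(a) * b;
    if (wide > std::numeric_limits<std::int64_t>::max() ||
        wide < std::numeric_limits<std::int64_t>::min()) {
        throw std::overflow_error("int64_t multiplication overflowed");
    }
    return static_cast<std::int64_t>(wide);  // value is in range, conversion is exact
}
```

The product of two 64-bit values always fits in 128 bits, so the wide multiplication itself can never overflow.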

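SSA lets the compiler realize that things like (a+1)>a are always true, which can simplify many loops and optimization cases. That fact relies on signed overflow being undefined behaviour.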
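The (a+1)>a point can be illustrated with a pair of small functions. This is only a sketch with hypothetical names; whether a particular compiler actually folds the signed version depends on its version and optimization level.

```cpp
// Hypothetical helper names, used only for this sketch.

// Signed: overflow is undefined, so a compiler is allowed to treat this as
// `return true;` without looking at `a`.
bool signed_next_is_greater(int a) {
    return a + 1 > a;
}

// Unsigned: wraparound is defined, so the comparison must really be evaluated;
// for the largest unsigned value, a + 1 wraps to 0 and the result is false.
bool unsigned_next_is_greater(unsigned a) {
    return a + 1 > a;
}
```

Calling the signed version with the largest int value would itself be undefined behaviour, which is exactly why the compiler does not need to account for that case.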

Answer 5

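If you are worried about integer overflow, wouldn't you be better off using an arbitrary-precision integer library? With one, you can widen your integer types to 128 bits and not worry about it.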

https://gmplib.org/

Answer 6

Given the knowledge that product == a * b, the compiler/optimizer can take the following optimization steps:

b != 0 && product / b != a
b != 0 && a * b / b != a
b != 0 && a * 1 != a
b != 0 && a != a
b != 0 && false
false

The optimizer can choose to remove the branch entirely.

I understand that signed integer overflow is an 'undefined behaviour' in C++ and so the compiler behaviour could change across implementations. But could someone help me make sense of the above behaviour?

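You may know that signed integer overflow is UB, but I suppose you haven't grasped what UB really means. UB does not need to make sense, and often doesn't. This case seems straightforward, though.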

Answer 7

You can read this document; it may be useful to you. Whenever I run into a problem with variables and data types, I go straight to it: http://www.cplusplus.com/doc/tutorial/variables/

c++

integer-overflow

devtoolset