Why does modulo division go wrong for a mix of size_t and unsigned int in C++

Given the program

#include <iostream>
using namespace std;

int main()
{
     const size_t DoW = 7;
     const unsigned int DAYS_OF_WEEK = static_cast<unsigned int> (DoW);
     unsigned int dayOfFirstDay = 0;
     unsigned int _firstDayOfWeek = 1;
     unsigned int diff = (DAYS_OF_WEEK+ (dayOfFirstDay - _firstDayOfWeek) ) % DAYS_OF_WEEK;
     cout << "diff = ("  << DAYS_OF_WEEK << " + (" << dayOfFirstDay << " - " << _firstDayOfWeek << ")) %" << DAYS_OF_WEEK
         << " = " << diff << endl;
     return 0;
}

the output is

diff = (7 + (0 - 1)) %7 = 6

which is as expected. But the modified program without the static_cast

#include <iostream>
using namespace std;

int main()
{
     const size_t DAYS_OF_WEEK = 7;
     unsigned int dayOfFirstDay = 0;
     unsigned int _firstDayOfWeek = 1;
     unsigned int diff = (DAYS_OF_WEEK+ (dayOfFirstDay - _firstDayOfWeek) ) % DAYS_OF_WEEK;
     cout << "diff = ("  << DAYS_OF_WEEK << " + (" << dayOfFirstDay << " - " << _firstDayOfWeek << ")) %" << DAYS_OF_WEEK
         << " = " << diff << endl;
     return 0;
}

produces

diff = (7 + (0 - 1)) %7 = 3

which is not expected. Why?

(Both programs were compiled with g++ 9.3.0 on 64-bit Ubuntu.)

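You get this result when size_t is wider than unsigned int.

Subtracting an unsigned int from an unsigned int wraps around and yields an unsigned int. 0 - 1 is mathematically -1, which becomes 0xffffffff when unsigned int is 4 bytes long.

Adding that to another unsigned int again yields an unsigned int, so the sum wraps a second time and the result looks like ordinary subtraction and addition.

Adding it to a size_t, on the other hand, is computed in the size_t domain, so no truncation happens and the value that gets divided is 7 + 0xffffffff rather than 7 - 1.

Here is some example code to check the values before the division: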

#include <iostream>
#include <ios>

int main()
{
     const size_t DoW = 7;
     const unsigned int DAYS_OF_WEEK = static_cast<unsigned int> (DoW);
     unsigned int dayOfFirstDay = 0;
     unsigned int _firstDayOfWeek = 1;
     size_t to_add = dayOfFirstDay - _firstDayOfWeek;
     size_t diff_uint = DAYS_OF_WEEK+ (dayOfFirstDay - _firstDayOfWeek);
     size_t diff_sizet = DoW+ (dayOfFirstDay - _firstDayOfWeek);
     std::cout << "sizeof(unsigned int) = " << sizeof(unsigned int) << '\n';
     std::cout << "sizeof(size_t) = " << sizeof(size_t) << '\n';
     std::cout << std::hex;
     std::cout << "to add     : 0x" << to_add << '\n';
     std::cout << "diff_uint  : 0x" << diff_uint << '\n';
     std::cout << "diff_sizet : 0x" << diff_sizet << '\n';
     return 0;
}

Here is an example of the output:

sizeof(unsigned int) = 4
sizeof(size_t) = 8
to add     : 0xffffffff
diff_uint  : 0x6
diff_sizet : 0x100000006

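dayOfFirstDay - _firstDayOfWeek is an unsigned int. Because _firstDayOfWeek is greater than dayOfFirstDay, the value underflows, wraps around, and becomes the maximum value of unsigned int. Call that value max_uint.

DAYS_OF_WEEK, on the other hand, is a size_t, which may be wider than unsigned int. In that case DAYS_OF_WEEK + max_uint does not overflow, so you end up computing (7 + max_uint) % 7, which is the same as max_uint % 7. But max_uint is not -1: with a 32-bit unsigned int, max_uint is 4294967295 and max_uint % 7 is 3, not the 6 you would get from (7 - 1) % 7.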
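The remainder of max_uint modulo 7 can be checked at compile time. The following is a minimal sketch, not part of the original answer: it uses std::uint32_t as a stand-in for a 32-bit unsigned int (an assumption about the platform), and the names max_uint32, two_pow_32 and surplus_mod7 are made up for illustration. It shows that the wrapped subtraction adds a surplus of 2^32, and that this surplus shifts the remainder by 4.

```cpp
#include <cstdint>
#include <limits>

// Stand-in for a 32-bit unsigned int (assumption about the platform).
constexpr std::uint32_t max_uint32 = std::numeric_limits<std::uint32_t>::max();
constexpr std::uint64_t two_pow_32 = std::uint64_t{1} << 32;

// Remainder modulo 7 of the surplus 2^32 that the wrapped subtraction adds.
constexpr std::uint64_t surplus_mod7() { return two_pow_32 % 7; }

static_assert(max_uint32 == two_pow_32 - 1, "0u - 1u wraps to 2^32 - 1");
static_assert(surplus_mod7() == 4, "2^32 leaves remainder 4 modulo 7");
static_assert(max_uint32 % 7 == 3, "so 2^32 - 1 leaves remainder 3, not 6");
static_assert((6 + surplus_mod7()) % 7 == 3, "expected 6, shifted by 4, gives 3");
```

If the file compiles, all four claims hold. In pure 32-bit arithmetic the surplus 2^32 is discarded again by the second wraparound, which is why the first program prints 6.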

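It seems that on your platform size_t is 64 bits and unsigned int is 32 bits.

The subtraction is carried out in 32 bits, because there is no integral promotion to 64 bits [1]. That is the danger of mixing 32-bit and 64-bit operands in one expression.

So the 32-bit wraparound of -1 stays 4294967295 when it is converted to 64 bits.

We get 7 + 4294967295 (performed in 64 bits) = 4294967302 (no wraparound).

4294967302 % 7 = 3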


[1] Except on systems where (unsigned) int is itself 64 bits, which is unlikely at present.
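The two evaluation paths can be reproduced independently of the platform's size_t with fixed-width types. This is a sketch under the assumption that std::uint32_t is the same type as unsigned int and that int is not wider than 32 bits, as on 64-bit Linux; the function names diff_all_32 and diff_mixed are hypothetical.

```cpp
#include <cstdint>

// 32-bit path: every operand is 32 bits, so the sum wraps modulo 2^32
// and the earlier wraparound of the subtraction is undone.
constexpr std::uint32_t diff_all_32(std::uint32_t days, std::uint32_t first,
                                    std::uint32_t start) {
    return (days + (first - start)) % days;
}

// Mixed path: the subtraction still wraps in 32 bits, but the addition and
// the modulo are done in 64 bits, so the wrapped value is kept as is.
constexpr std::uint64_t diff_mixed(std::uint64_t days, std::uint32_t first,
                                   std::uint32_t start) {
    return (days + (first - start)) % days;
}

static_assert(diff_all_32(7, 0, 1) == 6, "all-32-bit arithmetic gives 6");
static_assert(diff_mixed(7, 0, 1) == 3, "64-bit addition gives 3");
```

The two function bodies are textually identical; only the type of days differs, which mirrors the difference between the two programs in the question.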

Try it with less clutter:

#include <stdio.h>
#include <stddef.h>

int main() {
  printf("0u - 1u = %u\n", 0u - 1u);
  printf("7u + (0u - 1u) = %u\n", 7u + (0u - 1u));
  printf("7zu + (0u - 1u) = %zu\n", size_t{7} + (0u - 1u));
}

The output I get:

0u - 1u = 4294967295
7u + (0u - 1u) = 6
7zu + (0u - 1u) = 4294967302

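As you can see, 0u - 1u causes a wraparound. Adding that huge number to an unsigned int causes another wraparound. Adding it to a size_t does not, because the whole value is representable. Hence you get different results after the modulo operator.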
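The types involved can also be inspected directly with decltype. The snippet below is a small sketch rather than part of the answer; it relies only on the standard headers <cstddef> and <type_traits>, the variable name size_t_is_wider is made up, and the comments assume a platform where std::size_t has at least the conversion rank of unsigned int.

```cpp
#include <cstddef>
#include <type_traits>

// Subtracting two unsigned int operands yields an unsigned int,
// so the wraparound happens before any wider type is involved.
static_assert(std::is_same<decltype(0u - 1u), unsigned int>::value,
              "unsigned int - unsigned int is unsigned int");

// Adding a std::size_t operand converts the other operand to std::size_t
// (assumption: std::size_t is not narrower than unsigned int).
static_assert(std::is_same<decltype(std::size_t{7} + (0u - 1u)), std::size_t>::value,
              "std::size_t + unsigned int is std::size_t");

// True on typical 64-bit platforms; only then do the two programs differ.
constexpr bool size_t_is_wider = sizeof(std::size_t) > sizeof(unsigned int);
```

Where size_t_is_wider is false, for example on many 32-bit targets, both programs from the question would be expected to print 6.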

In the initializer expression of this declaration

unsigned int diff = (DAYS_OF_WEEK+ (dayOfFirstDay - _firstDayOfWeek) ) % DAYS_OF_WEEK;

the sub-expression (dayOfFirstDay - _firstDayOfWeek) equals the maximum value of type unsigned int.

So when DAYS_OF_WEEK has type unsigned int, in this sub-expression

(DAYS_OF_WEEK+ (dayOfFirstDay - _firstDayOfWeek) )

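an overflow (wraparound) occurs.

When DAYS_OF_WEEK has type size_t, and size_t is wider than unsigned int, no overflow occurs.

That is why the results differ.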
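One way to avoid the surprise is to keep all operands in a single type and to add before subtracting, so that no intermediate value goes below zero. The helper below is a hedged sketch, not something taken from the question: the name weekday_offset and its parameter names are hypothetical, and it assumes both day values lie in the range [0, days_in_week).

```cpp
#include <cstddef>

// Hypothetical helper: every operand is unsigned int, and the addition is
// done before the subtraction, so no intermediate result wraps as long as
// first_day_of_week <= days_in_week + day.
constexpr unsigned int weekday_offset(unsigned int day,
                                      unsigned int first_day_of_week,
                                      unsigned int days_in_week) {
    return (days_in_week + day - first_day_of_week) % days_in_week;
}

// The constant may stay a std::size_t; it is narrowed once, explicitly.
constexpr std::size_t DAYS_OF_WEEK = 7;

static_assert(weekday_offset(0, 1, static_cast<unsigned int>(DAYS_OF_WEEK)) == 6,
              "Sunday relative to a Monday-first week");
static_assert(weekday_offset(3, 1, static_cast<unsigned int>(DAYS_OF_WEEK)) == 2,
              "no wraparound involved");
```

The explicit static_cast plays the same role as in the first program of the question: it makes the narrowing visible at the call site instead of leaving the widening to the usual arithmetic conversions.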