Why does casting unsigned to signed directly in C give the correct result?

Question

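In C, signed and unsigned integers are stored differently in memory. C also converts between signed and unsigned integers implicitly when the types are clear at run time. However, when I try the following snippet,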

#include <stdio.h>

int main() {    
    unsigned int a = 5;
    signed int b = a;
    signed int c = *(unsigned int*)&a;
    signed int d = *(signed int*)&a;

    printf("%u\n", a);
    printf("%i\n", b);
    printf("%i\n", c);
    printf("%i\n", d);

    return 0;
}

I expected the output to be:

5
5                   //Implicit conversion occurs
5                   //Implicit conversion occurs, because it knows that *(unsigned int*)&a is an unsigned int
[some crazy number] //a is cast directly to signed int without conversion

However, in reality, it outputs

5
5
5
5

Why?

Answer 1

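Contrary to the explanations in the comment section, I would still argue that your integers are all stored the same way in memory. I am very willing to revise my answer, but at this point I still do not believe that unsigned and signed integers are stored differently in memory [actually, I know ^^].

Test program: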

#include <cstdio>     // printf
#include <exception>  // std::terminate

int main() {    
    unsigned int a = 5;
    signed int b = a;
    signed int c = *(unsigned int*)&a;
    signed int d = *(signed int*)&a;

    printf("%u\n", a);
    printf("%i\n", b);
    printf("%i\n", c);
    printf("%i\n", d);
    std::terminate();  // aborts here so GDB stops while main's locals are still alive

    return 0;
}

Compile with: g++ -O0 -g test.cpp

Run it in GDB: gdb ./a.out

After std::terminate is called, we can inspect the raw memory:

(gdb) print/t main::a
 = 101
(gdb) print/t main::b
 = 101
(gdb) print/t main::c
 = 101
(gdb) print/t main::d
 = 101
(gdb)

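The integers are all stored the same way, whether they are unsigned or signed. The only difference is in how the bits are interpreted once an unsigned value greater than INT_MAX is cast to a signed integer. That cast does not change the memory at all either.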

Answer 2

Your claim that...

In C, signed integer and unsigned integer are stored differently in memory

... is to a large extent wrong. The standard instead specifies:

For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M <= N ). If the sign bit is zero, it shall not affect the resulting value.

(C2011 6.2.6.2/2; emphasis added)

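Thus, although the representations of a signed integer type and its corresponding unsigned integer type (which have the same size) must differ at least in that the former has a sign bit and the latter does not, most bits of the representations in fact correspond exactly. The standard requires it. Small(ish), non-negative integers will be represented identically in corresponding signed and unsigned integer types.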

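Additionally, some of the comments raised the issue of the "strict aliasing rule", which is paragraph 6.5/7 of the standard. It forbids accessing an object of one type via an lvalue of a different type, as your code does, but it allows some notable exceptions. One of those exceptions is that you may access an object via an lvalue whose type is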

a type that is the signed or unsigned type corresponding to the effective type of the object,

This is in fact exactly what your code does, so there is no strict-aliasing violation there.
