Why does casting unsigned to signed directly in C give the correct result?
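In C, signed integer and unsigned integer are stored differently in memory. C also converts between signed and unsigned integers implicitly when the types are known. However, when I try the following snippet,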
#include <stdio.h>
int main(void) {
    unsigned int a = 5;
    signed int b = a;
    signed int c = *(unsigned int*)&a;
    signed int d = *(signed int*)&a;
    printf("%u\n", a);
    printf("%i\n", b);
    printf("%i\n", c);
    printf("%i\n", d);
    return 0;
}
I expected the output to be:
5
5 //Implicit conversion occurs
5 //Implicit conversion occurs, because it knows that *(unsigned int*)&a is an unsigned int
[some crazy number] //a is cast directly to signed int without conversion
However, in reality it outputs
5
5
5
5
Why?
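Contrary to the explanations in the comment section, I would still argue that all your integers are stored the same way in memory. I am happy to revise my answer, but at this point I still do not believe that unsigned and signed integers are stored differently in memory [actually, I know ^^].
Test program: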
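#include <cstdio>     // std::printf
#include <exception>  // std::terminate
int main() {
    unsigned int a = 5;
    signed int b = a;
    signed int c = *(unsigned int*)&a;
    signed int d = *(signed int*)&a;
    std::printf("%u\n", a);
    std::printf("%i\n", b);
    std::printf("%i\n", c);
    std::printf("%i\n", d);
    std::terminate();  // abort here so the locals can be inspected in GDB
    return 0;
}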
Compile it with:
g++ -O0 -g test.cpp
Run it in GDB:
gdb ./a.out
After std::terminate has been called, we can inspect the raw memory:
(gdb) print/t main::a
= 101
(gdb) print/t main::b
= 101
(gdb) print/t main::c
= 101
(gdb) print/t main::d
= 101
(gdb)
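The integers are all stored the same way, whether they are unsigned or signed. The only difference is how the bits are interpreted, and that only shows once an unsigned integer above INT_MAX is cast to a signed integer. Even then, the cast does not change the memory at all.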
Your claim that ...
In C, signed integer and unsigned integer are stored differently in memory
... is in large part incorrect. The standard instead specifies:
For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M <= N ). If the sign bit is zero, it shall not affect the resulting value.
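(C2011 6.2.6.2/2; emphasis added)
Thus, although the representations of a signed integer type and its corresponding unsigned integer type (which have the same size) must differ at least in that the former has a sign bit and the latter does not, most bits of the representations in fact correspond exactly. The standard requires it. Small(ish), non-negative integers will be represented identically in corresponding signed and unsigned integer types.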
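Additionally, some comments raised the issue of the "strict aliasing rule", i.e. paragraph 6.5/7 of the standard. It forbids accessing an object of one type via an lvalue of a different type, as your code does, but it allows some notable exceptions. One of those exceptions is that you may access an object via an lvalue whose type is
- a type that is the signed or unsigned type corresponding to the effective type of the object,
That is in fact what your code does, so there is no strict aliasing violation there.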