Representation of int in memory

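On an architecture where an int is represented in memory using multiple bytes, what constraints does the C Standard impose regarding possible representations? Most current systems use little-endian or big-endian representation, but is it possible to have a conforming system with a different representation? How different can it be?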

what constraints does the C Standard impose regarding possible representations?

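Three encodings are allowed: two's complement, ones' complement, and sign and magnitude. The two that are not two's complement may have a -0 or a trap representation.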

int must be 16 bits or wider (range at least [-32767...32767]). Real historical examples include 36-bit and 64-bit int.

but is it possible to have a conforming system with a different representation?

Example: PDP-endian

0x01020304 is stored as the bytes 2, 1, 4, 3, from the lowest address to the highest.

How different can it be?

int may have padding; char cannot. I do not know of any implementation whose int has padding.

When a "byte" is 16 bits or wider, int may be a single "byte".
IIRC, some graphics processors used a 64-bit "byte", with char, int, long, and long long all that size.

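I once used an implementation with 64-bit long and unsigned long where unsigned long had 1 padding bit, so ULONG_MAX == LONG_MAX. Conforming, but unusual. In theory UINT_MAX == INT_MAX is possible, but I have never heard of such an implementation.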

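As of 2020, I suspect the following are universal:

Endianness: big or little.
Two's complement. (The next C revision may require this.)
"Byte" size of 8 (possibly 16 or 32), with int being 16 or 32 bits.
No padding.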

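From the following citations of the standard, we see:

int has at least 16 bits.
Any byte order is allowed.
Any bit order is allowed (but it must match unsigned int).
The value bits are binary.
Negative values use one of three specified methods.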

C 2018 6.2.6.1 says:

1 The representations of all types are unspecified except as stated in this subclause.

2 Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.

4 Values stored in non-bit-field objects of any other object type [other than unsigned bit-fields and unsigned char, addressed in paragraph 3] consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes…

6.2.6.2 says:

1 For unsigned integer types other than unsigned char,… If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N−1), so that objects of that type shall be capable of representing values from 0 to 2^N − 1 using a pure binary representation;…

2 For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; signed char shall not have any padding bits. There shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M ≤ N ). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, the value shall be modified in one of the following ways:

— the corresponding value with sign bit 0 is negated (sign and magnitude);

— the sign bit has the value −(2^M ) (two’s complement);

— the sign bit has the value −(2^M − 1) (ones’ complement).

Which of these applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones’ complement), is a trap representation or a normal value. In the case of sign and magnitude and ones’ complement, if this representation is a normal value it is called a negative zero.

5 The values of any padding bits are unspecified… For any integer type, the object representation where all the bits are zero shall be a representation of the value zero in that type.

And 5.2.4.2.1 tells us that int must be capable of representing at least −32767 to +32767, from which we deduce that it has at least 15 value bits.
