Data layouts used by C compilers (the alignment concept)

Question

The following is an excerpt from the Red Dragon Book:

Example 7.3. Figure 7.9 is a simplification of the data layout used by C compilers for two machines that we call Machine 1 and Machine 2.

Machine 1 : The memory of Machine 1 is organized into bytes consisting of 8 bits each. Even though every byte has an address, the instruction set favors short integers being positioned at bytes whose addresses are even, and integers being positioned at addresses that are divisible by 4. The compiler places short integers at even addresses, even if it has to skip a byte as padding in the process. Thus, four bytes, consisting of 32 bits, may be allocated for a character followed by a short integer.

Machine 2: each word consists of 64 bits, and 24 bits are allowed for the address of a word. There are 64 possibilities for the individual bits inside a word, so 6 additional bits are needed to distinguish between them. By design, a pointer to a character on Machine 2 takes 30 bits — 24 to find the word and 6 for the position of the character inside the word. The strong word orientation of the instruction set of Machine 2 has led the compiler to allocate a complete word at a time, even when fewer bits would suffice to represent all possible values of that type; e.g., only 8 bits are needed to represent a character. Hence, under alignment, Fig. 7.9 shows 64 bits for each type. Within each word, the bits for each basic type are in specified positions. Two words consisting of 128 bits would be allocated for a character followed by a short integer, with the character using only 8 of the bits in the first word and the short integer using only 24 of the bits in the second word. □

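I read up on the concept of alignment in several places. What I could gather from them is this: in word-addressable CPUs (where the word size is more than one byte), padding is introduced into data objects so that the CPU can retrieve data from memory efficiently, in the minimum number of memory cycles.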

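Now, Machine 1 here is actually byte-addressable, and the conditions in its specification are probably harder to satisfy than those of a simple word-addressable machine with a word size of, say, 4 bytes. On such a word-addressable machine we only need to make sure that our data items are word aligned, nothing more. But how do we find the alignment on a system like Machine 1 (as given in the table of Fig. 7.9), where the simple concept of word alignment does not work because the machine is byte-addressable and has a more complicated specification?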

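Moreover, I find it quite weird that in the row for double the size of the type is more than what is given in the alignment field. Shouldn't alignment (in bits) ≥ size (in bits)? Because alignment refers to the memory actually allocated for the data object (?).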

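"each word consists of 64 bits, and 24 bits are allowed for the address of a word. There are 64 possibilities for the individual bits inside a word, so 6 additional bits are needed to distinguish between them. By design, a pointer to a character on Machine 2 takes 30 bits: 24 to find the word and 6 for the position of the character inside the word." Also, how should this statement about pointers be visualized in terms of alignment? (2^6 = 64 is fine, but how do these 6 bits relate to the alignment concept?)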

Answer 1

First of all, Machine 1 is not special at all. It is exactly like x86-32 or 32-bit ARM.

Moreover I find it quite weird that in the row for double the size of the type is more than what is given in the alignment field. Shouldn't alignment(in bits) ≥ size (in bits) ? Because alignment refers to the memory actually allocated for the data object (?).

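No, this isn't true. Alignment means that the address of the lowest addressable byte of the object must be divisible by the given number of bytes.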

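Additionally, in C it is also true that, within arrays, sizeof (ElementType) must be greater than or equal to the alignment of each member, and sizeof (ElementType) must be divisible by the alignment, hence footnote a. Therefore, on the latter machine: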

 struct { char a, b; }

may have sizeof 16, because the two characters are in different addressable words, whereas

struct { char a[2];  }

can be squeezed into 8 bytes.

how should this statement about pointers be visualized in terms of alignment? (2^6 = 64 is fine, but how do these 6 bits relate to the alignment concept?)

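As for the character pointers, the 6 bits are bogus. 3 bits are needed to choose one of the 8 bytes within an 8-byte word, so this is an error in the book. An ordinary pointer selects just a word, using 24 bits; a character (byte) pointer selects the word with 24 bits and one of the 8-bit bytes inside the word with 3 more bits.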

c

compiler-construction

local

runtime-environment

memory-alignment