Why can I store a variable at an address that is not a multiple of its minimum alignment?

Question

According to this answer:

The minimal alignment is (on a given platform) the one which won't give crashes.

With GCC 8, there are two ways to get the minimum and the preferred alignment:

the standard alignof operator, which gives the minimum alignment
the GNU __alignof__ operator, which gives the preferred alignment

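For double on the i386 architecture, the minimum alignment is 4 bytes and the preferred alignment is 8 bytes. So, if I correctly understood the answer I quoted above, an application that stores a double at an address that is not a multiple of 4 should crash.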

Consider the following code:

#include <iostream>

void f(void* ptr) {
    double* ptr_double = (double*) ptr;
    ptr_double[0] = 3.5;
    std::cout << ptr_double[0] << std::endl;
    std::cout << &ptr_double[0] << std::endl;
}

int main()
{
    alignas(__alignof__(double)) char arr[9];
    f(arr+1);

    return 0;
}

However, if I compile it with the -m32 option, it runs fine and I get the following result:

3.5
0xffe41571

We can see that my double is not aligned, yet the program runs without any problem.

The next sentence of the answer quoted above is:

On x86-64 it is one byte.

In a way this seems to be true, since my code works. But in that case, why does alignof return 4?

Where is the problem? Is the given definition of minimum alignment wrong? Or is there something I don't understand?

Answer 1

The minimal alignment is (on a given platform) the one which won't give crashes.

Therefore, if I correctly understood the answer I quoted above, an application that stores a double at an address that is not a multiple of 4 should crash.

You are denying the antecedent.

Just because conforming to the alignment won't cause a crash doesn't mean that violating it will cause one.

This is what the C++ standard says:

[expr.alignof] An alignof expression yields the alignment requirement of its operand type.

[basic.align] Object types have alignment requirements ([basic.fundamental], [basic.compound]) which place restrictions on the addresses at which an object of that type may be allocated. An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated. An object type imposes an alignment requirement on every object of that type; stricter alignment can be requested using the alignment specifier.

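As far as the C++ language is concerned, there are no unaligned objects, so it does not specify anything about their behavior. What you are doing is accessing an object that does not exist, and the behavior of the program is undefined.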

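Some CPU architectures, notably the one you are using¹, do not crash when accessing misaligned memory addresses. Such operations are merely slower, to a greater or lesser degree.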

But in that case, why does alignof return 4?

Because the language implementation chose so. Presumably because it is faster than using 1 or 2, but not faster than using 8.

¹ The programmer's reference manual for the 80386 says:

Note that words need not be aligned at even-numbered addresses and doublewords need not be aligned at addresses evenly divisible by four. This allows maximum flexibility in data structures (e.g., records containing mixed byte, word, and doubleword items) and efficiency in memory utilization. When used in a configuration with a 32-bit bus, actual transfers of data between processor and memory take place in units of doublewords beginning at addresses evenly divisible by four; however, the processor converts requests for misaligned words or doublewords into the appropriate sequences of requests acceptable to the memory interface. Such misaligned data transfers reduce performance by requiring extra memory cycles. For maximum performance, data structures (including stacks) should be designed in such a way that, whenever possible, word operands are aligned at even addresses and doubleword operands are aligned at addresses evenly divisible by four. Due to instruction prefetching and queuing within the CPU, there is no requirement for instructions to be aligned on word or doubleword boundaries. (However, a slight increase in speed results if the target addresses of control transfers are evenly divisible by four.)

However, successor architectures of the i386 introduced vector extensions that do require alignment.

In summary: the GCC documentation defines "minimum alignment" differently from Starynkevitch.

c++

memory-alignment