Strict aliasing rule in C

Question

I am trying to understand the strict aliasing rule as defined in 6.5(p6):

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

and 6.5(p7):

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:88)

- a type compatible with the effective type of the object

Consider the following example:

#include <stdlib.h>

struct test_internal_struct_t{
    int a;
    int b;
};

struct test_struct_t{
    struct test_internal_struct_t tis;
};

int main(void){
    //allocated object, no declared type
    struct test_struct_t *test_struct_ptr = malloc(sizeof(*test_struct_ptr));
    if (test_struct_ptr == NULL)
        return 1;

    //object designated by the lvalue has type int
    test_struct_ptr->tis.a = 1;

    //object designated by the lvalue has type int
    test_struct_ptr->tis.b = 2;

    //VIOLATION OF STRICT ALIASING RULE???
    struct test_internal_struct_t tis = test_struct_ptr->tis;

    free(test_struct_ptr);
    return 0;
}

The object returned by malloc(sizeof(*test_struct_ptr)) has no declared type because it is allocated, as footnote 87 states:

87) Allocated objects have no declared type

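The objects accessed through test_struct_ptr->tis.a and test_struct_ptr->tis.b have effective type int. But the object test_struct_ptr->tis has no effective type, because it is allocated.

QUESTION: Does struct test_internal_struct_t tis = test_struct_ptr->tis; violate strict aliasing? The object designated by test_struct_ptr->tis has no effective type, but the lvalue has type struct test_internal_struct_t.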

Answer 1

C 2018 6.5 6 defines effective type using the phrase "stored ... through an lvalue" but never defines that phrase:

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

So it is left to readers to interpret. Consider this code:

struct a { int x; };
struct b { int x; };

void copy(int N, struct a *A, struct b *B)
{
    for (int n = 0; n < N; ++n)
        A[n].x = B[n].x;
}

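If the compiler knows that the various objects A[n] do not overlap the various objects B[n], it can optimize this code by loading several B[n] in one instruction (such as an AVX or other single-instruction multiple-data [SIMD] instruction) and storing several A[n] in one instruction. (This may require additional code to handle loop fragments and alignment issues. Those do not concern us here.) If it is possible that some A[n].x refers to the same object as some B[n].x for a different value of n, then the compiler may not use such multi-element loads and stores, because that could change the observable behavior of the program. For example, suppose memory contains ten int with values from 0 to 9, and B points to the 0 while A points to the 2: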

B   A
0 1 2 3 4 5 6 7 8 9

Then the loop as written, given N = 4, must copy one element at a time, producing:

0 1 0 1 0 1 6 7 8 9

If the compiler optimized this into a four-element load and store, it could load 0 1 2 3 and then store 0 1 2 3, producing:

0 1 0 1 2 3 6 7 8 9
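The two outcomes above can be reproduced with a small sketch. It deliberately uses plain int pointers instead of struct a and struct b, so that the overlapping accesses themselves are well defined. The names copy_scalar, copy_block, and MAX_BLOCK are illustrative assumptions, and copy_block only models "load several elements, then store several elements" with memcpy; it is not real SIMD code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BLOCK 16  /* hypothetical upper bound on the modeled vector width */

/* Element-by-element copy: the behavior the loop as written must have. */
void copy_scalar(int n_elems, int *dst, const int *src)
{
    for (int n = 0; n < n_elems; ++n)
        dst[n] = src[n];
}

/* Models a multi-element load followed by a multi-element store:
   every source element is read before any destination element is written. */
void copy_block(int n_elems, int *dst, const int *src)
{
    int tmp[MAX_BLOCK];

    assert(n_elems >= 0 && n_elems <= MAX_BLOCK);
    memcpy(tmp, src, (size_t)n_elems * sizeof tmp[0]);  /* "vector load"  */
    memcpy(dst, tmp, (size_t)n_elems * sizeof tmp[0]);  /* "vector store" */
}
```

Given an array m holding 0 through 9, calling copy_scalar(4, m + 2, m) leaves 0 1 0 1 0 1 6 7 8 9 in m, while copy_block(4, m + 2, m) on a fresh copy leaves 0 1 0 1 2 3 6 7 8 9. The two functions agree only when the source and destination ranges do not overlap in this way.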

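However, C tells us that struct a and struct b are incompatible, even though they have identical layouts. When types X and Y are incompatible, it tells us that an X is not a Y. An important purpose of the type system is to distinguish object types.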
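One way to see the incompatibility in code is a C11 generic selection, which requires that no two of its association types be compatible. The sketch below is only an illustration; TYPE_TAG, tag_of_a, tag_of_b, and layouts_match are hypothetical names, and the layout check reflects what typical implementations do rather than a guarantee quoted from the standard.

```c
#include <stddef.h>

struct a { int x; };
struct b { int x; };

/* A generic selection may not list two compatible types, so the fact that
   this macro compiles shows struct a and struct b are distinct types. */
#define TYPE_TAG(v) _Generic((v), struct a: 'a', struct b: 'b')

char tag_of_a(void) { struct a v = { 1 }; return TYPE_TAG(v); }
char tag_of_b(void) { struct b v = { 1 }; return TYPE_TAG(v); }

/* Returns 1 if both structures have the same size and member offset.
   Assumption: true on typical implementations. */
int layouts_match(void)
{
    return sizeof(struct a) == sizeof(struct b)
        && offsetof(struct a, x) == offsetof(struct b, x);
}
```

Here tag_of_a() selects the struct a association and tag_of_b() selects the struct b one, even though layouts_match() would normally report that the two structures are laid out identically.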

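Now consider the expression A[n].x = B[n].x. In this:

A[n] is an lvalue of type struct a.
Since A[n] is the left operand of ., it is not converted to a value.
A[n].x designates, and is an lvalue for, the member x of A[n].
The value of the right operand replaces the value in A[n].x.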

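So the direct access to the object that stores the value is only in the int that is the member A[n].x. The lvalue A[n] appears in the expression, but it is not the lvalue directly used to store the value. What is the effective type of the memory at &A[n]?

If we interpret this memory as merely an int, then the only restriction on object access is that all the B[n].x are int and all the A[n].x are int, so some or all of the A[n].x may access the same memory as some or all of the B[n].x, and the compiler is not permitted to make the optimization described above.

This does not serve the purpose of the type system to distinguish struct a from struct b, so it cannot be a correct interpretation. To enable the intended optimization, it must be that the memory stored to through A[n].x contains struct a objects, and the memory accessed through B[n].x contains struct b objects.

Therefore, "stored ... through an lvalue" must include expressions in which an lvalue is used to derive a structure member but is not itself the final lvalue used for the access.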
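Applied to the code in the question, this reading suggests that the member stores also give the allocated memory the structure types, so the later read of the whole tis member goes through an lvalue of a matching type. The following is a minimal sketch under that interpretation; the function name sum_after_member_stores and the -1 error value are illustrative assumptions, not part of the original question.

```c
#include <stdlib.h>

struct test_internal_struct_t { int a; int b; };
struct test_struct_t { struct test_internal_struct_t tis; };

/* Stores through member lvalues into allocated memory, then reads the
   enclosing structure member as a whole. Returns -1 if allocation fails. */
int sum_after_member_stores(void)
{
    struct test_struct_t *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;

    /* Each store goes through p->tis, an lvalue of structure type,
       on the way to the int member. */
    p->tis.a = 1;
    p->tis.b = 2;

    /* Under the interpretation above, this read does not violate 6.5p7. */
    struct test_internal_struct_t tis = p->tis;

    free(p);
    return tis.a + tis.b;
}
```

On a successful allocation the function returns 3, the sum of the two stored members.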

c

strict-aliasing

language-lawyer