Strict aliasing and references to compile-time C arrays

Given the following code:

#include <cassert>
#include <climits>
#include <cstdint>
#include <iostream>

static_assert(CHAR_BIT == 8, "A byte does not consist of 8 bits");

void func1(const int32_t& i)
{
    const unsigned char* j = reinterpret_cast<const unsigned char*>(&i);
    for(int k = 0; k < 4; ++k)
        std::cout << static_cast<int>(j[k]) << ' ';
    std::cout << '\n';
}

void func2(const int32_t& i)
{
    const unsigned char (&j)[4] = reinterpret_cast<const unsigned char (&)[4]>(i);
    for(int k = 0; k < 4; ++k)
        std::cout << static_cast<int>(j[k]) << ' ';
    std::cout << '\n';
}

int main() {
    func1(-1);
    func2(-1);
}

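It is clear from the language rules that func1 is fine, since a pointer to unsigned char may alias an object of any other type. My question is: does this extend to C++ references to C arrays of known length? Intuitively I would say yes. Is func2 well-defined, or does it trigger undefined behavior?

I have tried compiling the above code with Clang and GCC, using every possible combination of -Wextra -Wall -Wpedantic and UBSAN, and got no warnings and always the same output. That obviously does not prove the absence of UB, but I could not trigger any of the usual strict-aliasing optimization bugs.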

This is undefined behavior.

On the meaning of reinterpret_cast, here is [expr.reinterpret.cast]:

11 A glvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. The result refers to the same object as the source glvalue, but with the specified type. [ Note: That is, for lvalues, a reference cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with the built-in & and * operators (and similarly for reinterpret_cast<T&&>(x)). -- end note ] No temporary is created, no copy is made, and constructors or conversion functions are not called.

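This tells us that the cast in func2 is valid as long as reinterpret_cast<const unsigned char (*)[4]>(&i) is valid. No surprise there. But the crux of the matter is that you may not obtain anything meaningful from that pointer conversion. On this subject, [basic.compound] says: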

4 Two objects a and b are pointer-interconvertible if:

  • they are the same object, or
  • one is a standard-layout union object and the other is a non-static data member of that object ([class.union]), or
  • one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first base class subobject of that object ([class.mem]), or
  • there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.

If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast. [ Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. -- end note ]
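For contrast, here is a minimal sketch of a case that the list above does cover: a standard-layout class object and its first non-static data member. The names Wrapper and read_first_via_cast are hypothetical and used only for illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical standard-layout type, used only for illustration.
struct Wrapper {
    std::int32_t first;
    std::int32_t second;
};

static_assert(std::is_standard_layout<Wrapper>::value,
              "Wrapper must be standard-layout for the cast below to be meaningful");

// A Wrapper object and its first non-static data member are
// pointer-interconvertible, so this reinterpret_cast yields a pointer
// to w.first and the read through it is well-defined.
std::int32_t read_first_via_cast(Wrapper& w)
{
    return *reinterpret_cast<std::int32_t*>(&w);
}
```

No bullet in the list relates an int32_t object to an array of unsigned char, which is why the same reasoning does not carry over to func2.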

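This is an exhaustive list of the pointer conversions that give a meaningful result. An int32_t object and an array of unsigned char are not on it, so you are not permitted to obtain the address of such an array this way, and the result of the cast is therefore not a valid array glvalue. Any further use of the cast result is undefined.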
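If what you want is the bytes as an array of known length, one well-defined option is to copy the object representation instead of reinterpreting it. The following is only a sketch under that assumption; the helper name to_bytes is hypothetical and not part of the question.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the object representation of i into a
// real array of unsigned char, so no reinterpret_cast to an array
// type is involved.
std::array<unsigned char, sizeof(std::int32_t)> to_bytes(const std::int32_t& i)
{
    std::array<unsigned char, sizeof(std::int32_t)> out{};
    std::memcpy(out.data(), &i, sizeof i);  // byte-wise copy is always allowed
    return out;
}
```

With 8-bit bytes, as the static_assert in the question requires, to_bytes(-1) holds four bytes that are each 255. Compilers typically optimize a small fixed-size memcpy like this one into a plain load, so the copy usually costs nothing extra.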