Is it legal and well defined behavior to use a union for conversion between two structs with a common initial sequence (see example)?

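I have an API with a public-facing struct A and an internal struct B, and I need to be able to convert a struct B into a struct A. Is the following code legal and well-defined behavior in C99 (and VS 2010/C89) and C++03/C++11? If so, please explain what makes it well defined. If not, what is the most efficient and cross-platform way to convert between the two structs?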

#include <stdint.h>

struct A {
  uint32_t x;
  uint32_t y;
  uint32_t z;
};

struct B {
  uint32_t x;
  uint32_t y;
  uint32_t z;
  uint64_t c;
};

union U {
  struct A a;
  struct B b;
};

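/* DoSomething is assumed to be defined elsewhere in the API. */
void DoSomething(uint32_t x, uint32_t y, uint32_t z);

int main(int argc, char* argv[]) {
  union U u;  /* "union U" so that the code compiles as both C and C++ */
  u.b.x = 1;
  u.b.y = 2;
  u.b.z = 3;
  u.b.c = 64;

  /* Is it legal and well-defined behavior to access a union member other than the one last written in this case? */
  DoSomething(u.a.x, u.a.y, u.a.z);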

  return 0;
}


Update

I simplified the example and wrote two different applications. One is based on memcpy and the other uses a union.


Union:

struct A {
  int x;
  int y;
  int z;
};

struct B {
  int x;
  int y;
  int z;
  long c;
};

union U {
  struct A a;
  struct B b;
};

int main(int argc, char* argv[]) {
  U u;
  u.b.x = 1;
  u.b.y = 2;
  u.b.z = 3;
  u.b.c = 64;
  const A* a = &u.a;
  return 0;
}


memcpy:

#include <string.h>

struct A {
  int x;
  int y;
  int z;
};

struct B {
  int x;
  int y;
  int z;
  long c;
};

int main(int argc, char* argv[]) {
  B b;
  b.x = 1;
  b.y = 2;
  b.z = 3;
  b.c = 64;
  A a;
  memcpy(&a, &b, sizeof(a));
  return 0;
}



Profiled assembly [DEBUG] (Xcode 6.4, default C++ compiler):

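Here are the relevant differences in the assembly for the debug build. When I profiled the release builds, there was no difference in the assembly.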


Union:

movq     %rcx, -48(%rbp)


memcpy:

movq    -40(%rbp), %rsi
movq    %rsi, -56(%rbp)
movl    -32(%rbp), %edi
movl    %edi, -48(%rbp)



Warning:

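The union-based example code produces a warning about the unused variable 'a'. Since the profiled assembly comes from a debug build, I don't know whether that has any impact.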

This is fine, because the members you are accessing are elements of a common initial sequence.

C11 (6.5.2.3 Structure and union members; Semantics):

[...] if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

C++03 ([class.mem]/16):

If a POD-union contains two or more POD-structs that share a common initial sequence, and if the POD-union object currently contains one of these POD-structs, it is permitted to inspect the common initial part of any of them. Two POD-structs share a common initial sequence if corresponding members have layout-compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

Other versions of the two standards have similar language; since C++11 the terminology used is standard-layout rather than POD.
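If you want the layout assumption to be checked rather than just trusted, it can be asserted at compile time. This is only a sketch: it assumes a C11 compiler (for `static_assert` from `<assert.h>`), it reuses the struct definitions from the question, and the helper name `common_prefix_matches` is made up for illustration.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* uint32_t, uint64_t */

struct A { uint32_t x; uint32_t y; uint32_t z; };
struct B { uint32_t x; uint32_t y; uint32_t z; uint64_t c; };

/* Members of the common initial sequence are expected to sit at the
   same offsets in both structs; fail the build if they do not. */
static_assert(offsetof(struct A, x) == offsetof(struct B, x), "x offset differs");
static_assert(offsetof(struct A, y) == offsetof(struct B, y), "y offset differs");
static_assert(offsetof(struct A, z) == offsetof(struct B, z), "z offset differs");

/* Hypothetical runtime variant of the same check: returns 1 when the
   three shared members line up, 0 otherwise. */
static int common_prefix_matches(void) {
  return offsetof(struct A, x) == offsetof(struct B, x) &&
         offsetof(struct A, y) == offsetof(struct B, y) &&
         offsetof(struct A, z) == offsetof(struct B, z);
}
```

The static assertions cost nothing at run time; the runtime helper is only there for compilers that lack `static_assert`.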


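I think the confusion may have arisen because C permits type punning (aliasing a member of a different type) via a union, whereas C++ does not; this is the main case where, to ensure C/C++ compatibility, you would have to use memcpy. But in your case the elements you are accessing have the same type and are preceded by members of compatible types, so the type-punning rule isn't relevant.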
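For contrast, this is roughly what the memcpy route looks like when you really are punning between different types. It is a sketch, not something from your code: the function name `float_bits` is hypothetical, and it assumes `float` is 32 bits wide, which the static assertion checks.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>  /* uint32_t */
#include <string.h>  /* memcpy */

/* Assumption: float and uint32_t have the same size on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "float is assumed to be 32 bits");

/* Copy the object representation of a float into a uint32_t.
   Unlike reading a different union member, this is valid in both C and C++. */
static uint32_t float_bits(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);
  return bits;
}
```

Optimizing compilers typically turn a fixed-size memcpy like this into a plain register move, which is consistent with the identical release-build assembly reported in the question.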

This is legal in both C and C++.

For example, in C99 (6.5.2.3/5) and C11 (6.5.2.3/6):

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

There are similar provisions in C++11 and C++14 (different wording, same meaning).
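Applied to the API in the question, the conversion might be wrapped as follows. This is a minimal sketch: the helper names `b_to_a_union` and `b_to_a_memcpy` are hypothetical, and the union version deliberately reads only the individual members of the common initial sequence, which is what the quoted guarantee covers.

```c
#include <stdint.h>  /* uint32_t, uint64_t */
#include <string.h>  /* memcpy */

struct A { uint32_t x; uint32_t y; uint32_t z; };
struct B { uint32_t x; uint32_t y; uint32_t z; uint64_t c; };

union U {
  struct A a;
  struct B b;
};

/* Hypothetical helper: store a B in the union, then inspect the
   common initial part through the A member. */
static struct A b_to_a_union(const struct B *b) {
  union U u;
  struct A out;
  u.b = *b;
  out.x = u.a.x;
  out.y = u.a.y;
  out.z = u.a.z;
  return out;
}

/* Hypothetical alternative: copy the leading sizeof(struct A) bytes.
   This relies on the shared members having identical offsets. */
static struct A b_to_a_memcpy(const struct B *b) {
  struct A out;
  memcpy(&out, b, sizeof out);
  return out;
}
```

Either helper could be called as `struct A a = b_to_a_union(&b);`. Which one to prefer is mostly a matter of taste; a plain member-by-member copy without any union would work just as well.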