Can I assign a value to one union member and read the same value from another?

Basically, I have a

struct foo {
        /* variable denoting active member of union */
        enum whichmember w;
        union {
                struct some_struct my_struct;
                struct some_struct2 my_struct2;
                struct some_struct3 my_struct3;
                /* let's say that my_struct is the largest member */
        };
};

int main(void)
{
        /*...*/
        /* earlier in main, we get some struct foo d with an */
        /* unknown union assignment; d.w is correct, however */
        struct foo f;
        f.my_struct = d.my_struct; /* my_struct isn't necessarily the */
                                /* active member, but is the biggest */
        f.w = d.w;
        /* code that determines which member is active through f.w */
        /* ... */
        /* we then access the *correct* member that we just found */
        /* say, f.my_struct3 */

        f.my_struct3.some_member_not_in_mystruct = /* something */;
}

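"Accessing C union members via pointers" seems to say that accessing the members through pointers is fine. See the comments.

But my question is about accessing them directly. Basically, if I write all the information I need into the largest member of the union and keep track of the type manually, will accessing the manually tracked member still yield the correct information every time?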

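Yes, you can access them directly. You can assign a value to one union member and read it back through a different union member. The result will be deterministic and correct.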

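Yes, your code will work, because with a union the compiler makes all the members share the same memory space.

For example, if &f.my_struct == 100 (taking 100 as an illustrative address), then &f.my_struct2 == 100 and &f.my_struct3 == 100 as well.

If my_struct is the largest member, this will always work.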
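A minimal sketch of the shared-address claim above. The struct types and the helper members_share_address are hypothetical stand-ins, not the question's real types. It only checks that every member starts at the union's own address; it says nothing about which bytes an assignment is required to copy.

```c
/* Hypothetical stand-ins for the question's struct types. */
struct small  { int id; };
struct medium { int id; short flags; };
struct large  { int id; double value; char name[16]; };

union payload {
    struct large  big;   /* assumed to be the largest member */
    struct medium mid;
    struct small  tiny;
};

/* Returns 1 if every member of the union starts at the union's own address. */
int members_share_address(const union payload *p)
{
    const void *base = p;
    return (const void *)&p->big  == base
        && (const void *)&p->mid  == base
        && (const void *)&p->tiny == base;
}
```

Calling members_share_address on any union payload object should return 1, since a pointer to a union, suitably converted, points to each of its members.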

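I note that the code in the question uses an anonymous union, which means it must be written for C11; anonymous unions are not part of C90 or C99.

ISO/IEC 9899:2011, the current C11 standard, has this to say: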

§6.5.2.3 Structure and union members

¶3 A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member,95) and is an lvalue if the first expression is an lvalue. If the first expression has qualified type, the result has the so-qualified version of the type of the designated member.

¶4 A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. The value is that of the named member of the object to which the first expression points, and is an lvalue.96) If the first expression is a pointer to a qualified type, the result has the so-qualified version of the type of the designated member.

¶5 …

¶6 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.


95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

96) If &E is a valid pointer expression (where & is the ‘‘address-of’’ operator, which generates a pointer to its operand), the expression (&E)->MOS is the same as E.MOS.

Italics as in the standard.

And section 6.2.6 Representations of types says (in part):
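To illustrate the special guarantee in ¶6, here is a small sketch. The types msg_int and msg_float and the helper kind_via_other_member are hypothetical. Both structures begin with an int member, so they share a common initial sequence of one member, and that member may be inspected through either structure while the union declaration is visible.

```c
/* Hypothetical message types; both begin with the same `int kind` member,
   so they share a common initial sequence of one member. */
struct msg_int   { int kind; int    value; };
struct msg_float { int kind; double value; };

union msg {
    struct msg_int   i;
    struct msg_float f;
};

/* Stores through member `f`, then inspects the common initial part
   through member `i`, which is what 6.5.2.3 paragraph 6 permits. */
int kind_via_other_member(int kind)
{
    union msg m;
    m.f.kind  = kind;
    m.f.value = 1.5;
    return m.i.kind;
}
```

Reading m.i.value here would be a different matter: it lies outside the common initial sequence, so footnote 95 (type punning) applies instead.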

§6.2.6.1 General

¶6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.51) The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.

¶7 When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.


51) Thus, for example, structure assignment need not copy any padding bits.


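My interpretation of what you are doing is that footnote 51 says "it might not work", because you might have assigned only part of the structure. You are, at best, treading on thin ice. Against that, you stipulate that the structure assigned (in the f.my_struct = d.my_struct; assignment) is the largest member. The chances are moderately high that it won't go wrong, but if the padding bytes in the two structures (the active member of the union and the largest member of the union) are in different places, then things could go wrong, and if you reported the problem to the compiler writer, the compiler writer would simply tell you "don't break the standard".

So, to the extent that I am a language lawyer, this language lawyer's answer is "no guarantee". In practice you are unlikely to run into problems, but the possibility is there and you have no comeback against anyone.

To make your code safe, simply use f = d; so that the union is assigned as a whole.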
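A short sketch of the whole-object assignment suggested above. The types are the two-struct variant used in the example further down; the helpers make_type_b and copy_foo are hypothetical names introduced only for illustration.

```c
/* Two-struct variant of the question's struct foo (C11 anonymous union). */
struct Type_A { char x; double y; };
struct Type_B { int a; short b; short c; };
enum whichmember { TYPE_A, TYPE_B };

struct foo {
    enum whichmember w;
    union {
        struct Type_A s1;
        struct Type_B s2;
    };
};

/* Builds a foo whose active union member is s2. */
struct foo make_type_b(int a, short b, short c)
{
    struct foo d;
    d.w = TYPE_B;
    d.s2.a = a;
    d.s2.b = b;
    d.s2.c = c;
    return d;
}

/* Copies the whole object, tag and union together, so the active member
   is preserved whichever one it happens to be. */
struct foo copy_foo(const struct foo *d)
{
    struct foo f;
    f = *d;
    return f;
}
```

Because the union is copied as a union rather than through one particular member, the copy does not depend on where any member's padding lies.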


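Illustrative example

Suppose that the machine requires double to be aligned on an 8-byte boundary with sizeof(double) == 8, int to be aligned on a 4-byte boundary with sizeof(int) == 4, and short to be aligned on a 2-byte boundary with sizeof(short) == 2. This is a plausible and even common set of sizes and alignment requirements.

Further, suppose that you have a two-struct union variant of the structure in the question: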

struct Type_A { char x; double y; };
struct Type_B { int a; short b; short c; };
enum whichmember { TYPE_A, TYPE_B };

struct foo
{
    enum whichmember w;
    union
    {
        struct Type_A s1;
        struct Type_B s2;
    };
};

Now, under the sizes and alignments specified, struct Type_A occupies 16 bytes and struct Type_B occupies 8 bytes, so the union also uses 16 bytes. The layout of the union looks like this:

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| x | p...a...d...d...i...n...g |               y               |  s1
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|       a       |   b   |   c   |   p...a...d...d...i...n...g   |  s2
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

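The w element also means that there are 8 bytes in struct foo before the (anonymous) union, of which w most likely occupies only 4. The size of struct foo is therefore 24 on this machine. That is not particularly relevant to the discussion, though.

Now suppose we have code like this: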
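The sizes and offsets described above are properties of the assumed machine, not guarantees. A sketch like the following reports what a particular compiler actually does; print_layout is a hypothetical helper, and the numbers it prints are implementation-defined, so none are shown here.

```c
#include <stddef.h>
#include <stdio.h>

struct Type_A { char x; double y; };
struct Type_B { int a; short b; short c; };
enum whichmember { TYPE_A, TYPE_B };

struct foo {
    enum whichmember w;
    union {
        struct Type_A s1;
        struct Type_B s2;
    };
};

/* Prints the sizes and offsets discussed in the text. On the machine assumed
   in the answer these would be 16, 8, 24, 8 and 8; other platforms may differ. */
void print_layout(void)
{
    printf("sizeof(struct Type_A)       = %zu\n", sizeof(struct Type_A));
    printf("sizeof(struct Type_B)       = %zu\n", sizeof(struct Type_B));
    printf("sizeof(struct foo)          = %zu\n", sizeof(struct foo));
    printf("offsetof(struct Type_A, y)  = %zu\n", offsetof(struct Type_A, y));
    printf("offsetof(struct foo, s1)    = %zu\n", offsetof(struct foo, s1));
}
```

Any gap between offsetof(struct Type_A, y) and sizeof(char) is the padding inside s1 that the rest of the example is concerned with.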

struct foo d;
d.w = TYPE_B;
d.s2.a = 1234;
d.s2.b = 56;
d.s2.c = 78;

struct foo f;
f.s1 = d.s1;
f.w  = TYPE_B;

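Now, under the ruling of footnote 51, the structure assignment f.s1 = d.s1; does not have to copy the padding bits. I know of no compiler that behaves like that, but the standard says that a compiler need not copy the padding bits. This means that the value of f.s1 could be: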

+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| x | g...a...r...b...a...g...e |   r...u...b...b...i...s...h   |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

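Garbage, because those 7 bytes need not be copied (footnote 51 says that is an option, even though it is unlikely to be an option exercised by any current compiler). Rubbish, because the initialization of d never set any value in those bytes; the contents of that part of the structure are unspecified.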

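If you now go ahead and try to treat f as a copy of d, you might be a tad surprised to find that only 1 byte of the 8 relevant bytes of f.s2 has actually been initialized.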

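I'll emphasize again: I know of no compiler that would do this. But the question is tagged 'language lawyer', so the issue is 'what does the language standard state', and this is my interpretation of the quoted sections of the standard.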