Does `offsetof(struct Derived, super.x) == offsetof(struct Base, x)` hold true in C?

I'm not sure what André Caron means here:

Virtual functions in C

... some of this code relies on (officially) non-standard behavior that "just happens" to work on most compilers. The main issue is that the code assumes that &m.base == &m (e.g. the offset of the base member is 0). If that is not the case, then the cast in custom_bar() results in undefined behavior. To work around this issue, you can add an extra pointer in struct foo as such:

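m is of type struct meh *. The object f of type struct foo * is assigned to m by casting it to struct meh *. struct meh has a member base of type struct foo (struct foo meh::base = foo::bar). Why is it said that &m.base == &m is not guaranteed? I could see this if the struct were not a POD. André hinted at this as well. However, why is it necessary for a POD structure to have another pointer void *foo::hook?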

struct meh * m = (struct meh*)f; becomes struct meh * m = (struct meh*)f->hook;, after he assigns the hook with m->base.hook = m;.

struct meh
{
   /* inherit from "class foo". MUST be first. */
   struct foo base;
   int more_data;
};

Below, I list the relevant ISO C90/C++98 excerpts from my research. I also created a code example. The example code compiles fine with Clang using -fsanitize=undefined -std=c++98 -O0 -Wall -Wextra -Wpedantic -Wconversion -Wundef.

Here it is:

https://godbolt.org/z/qo9f8KnYM

Excerpts

From ISO C90 (ANSI C89):

An object shall have its stored value accessed only by an lvalue that has one of the following types: /28/

...

  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a
    subaggregate or contained union), or

A pointer to a structure object, suitably cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may therefore be unnamed holes within a structure object, but not at its beginning, as necessary to achieve the appropriate alignment.

From ISO C++98:

16 If a POD-union contains two or more POD-structs that share a common initial sequence, and if the POD-union object currently contains one of these POD-structs, it is permitted to inspect the common initial part of any of them. Two POD-structs share a common initial sequence if corresponding members have layout-compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

17 A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [Note: There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment. ]

Code example

#include <iostream>

struct A {
  int m1;
};

struct B {
  int m1;
  int m2;
};

struct C {
  struct A super;
  int m3;
};

int main(void) {
  struct A a = {42};
  struct C c = {{666}, 1984};

  // Access A::m1 through pointer of type B
  std::cout << ((B *)&a)->m1 << std::endl; // 42

  // Access A::m1 through pointer of type C
  std::cout << ((C *)&a)->super.m1 << std::endl; // 42

  // Access C::super::A::m1 through pointer of type A.
  std::cout << ((A *)(&c))->m1 << std::endl; // 666

  return 0;
}

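EDIT 1: Let me rewrite the question in this edit section. I will ignore C++, because people in the comments told me not to complicate the question. If this edit is more helpful than the original, then maybe you could consider replacing the original post with it. Or I or somebody else could just "delete" the original. Or, if you have a better idea of how to improve my question, please let me know. (I might add that I have issues with my attention and easily get lost in details... I'll leave it at that. You might have guessed what it is...) If my second attempt still fails, then maybe I should take my failure to formulate a clear question as a hint to think it over and write it down next time, if applicable. Without further ado, here is my second attempt at asking the question: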

I am referring to the answer posted here:

Virtual functions in C

  struct Base {
    int x;
  };

  struct Derived {
    struct Base super;
  };

If offsetof(struct Derived, super) == 0 and offsetof(struct Base, x) == 0, can we infer that offsetof(struct Derived, super.x) == offsetof(struct Base, x)?

André Caron suggests using an extra pointer to the derived object. Apparently, relying on offsetof(struct Derived, super.x) == offsetof(struct Base, x) is not sufficient or not portable.

Even though this works, you are relying on compiler extensions for type punning that can lead to undefined behavior blablabla. This works in GCC and MSVC for a fact.

Indeed the alignment stuff relies on compiler extensions. You can make it portable using an extra void* pointer in struct foo that points to the "derived object". However, the technique is sufficiently popular in well-known libraries to be considered "portable". Any compiler that made this type of code break would have lots of complaints from its customers.

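I have a hard time understanding how offsetof(struct Derived, super.x) != offsetof(struct Base, x) could be the case. I did not find a statement to that effect in the C standard. Therefore, I am asking for further clarification.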

13:26, restating my assumption:

Assume offsetof(struct Derived, super.x) != offsetof(struct Base, x)

  struct Base {
    int x;
    void *hook;
  };

  struct Derived {
    struct Base super;
  };

Given the above assumption, consider:

  struct Base base = {42};
  struct Derived derived;
  base.hook = &base; /* Assuming offsetof(struct Base, x) == 0 */
  derived.super = base;

(struct Base*)(derived.super.hook) == &base should be true.

#include <stddef.h>
#include <stdio.h>

struct Base {
  int x;
  void *hook;
};

struct Derived {
  struct Base super;
};

int main(void) {
  struct Base base = {42};
  struct Derived derived;
  base.hook = &base; /* Assuming offsetof(struct Base, x) == 0 */
  derived.super = base;

  /* offsetof yields size_t; cast so that %lu is a matching specifier. */
  printf("Offset Base x: %lu\n", (unsigned long)offsetof(struct Base, x));
  printf("Offset Derived super: %lu\n",
         (unsigned long)offsetof(struct Derived, super));
  printf("Offset Derived super.x: %lu\n",
         (unsigned long)offsetof(struct Derived, super.x));
  printf("Offset Derived super.hook: %lu\n",
         (unsigned long)offsetof(struct Derived, super.hook));
  printf("derived.super.hook == &base, yields %d\n",
         (struct Base *)(derived.super.hook) == &base);

  return 0;
}

However, why is it necessary for a POD structure to have another pointer void *foo::hook?

It is not necessary. From the original Q&A:

This technique is more reliable, especially if you plan to write the "derived struct" in C++ and use virtual functions. In that case, the offset of the first member is often non-0 as compilers store run-time type information and the class' v-table there.

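A C++ struct/class with virtual functions is not a POD. In a non-POD struct/class, any data member, including the first, may have a non-zero offset, and that is the case the hook is there to handle.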