Cyclic dependency of global variables with extern specifier

Question

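A global variable can be declared without being defined by using the extern storage class specifier. So I believe it is possible to introduce a cyclic dependency between global variables, much as forward declarations let classes/modules depend on each other. How does the linker handle such a dependency between variable definitions? Does this practice result in undefined behavior?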

//source2.cpp

extern int b;
int a = b + 1;

//source1.cpp

#include<iostream>

extern int a;
int b = a + 1;

int main() {
    std::cout << a << " " << b <<std::endl;
}

Or even:

#include<iostream>

extern int a;
int b = a + 1;
int a = b + 1;

int main() {
    std::cout << a << " " << b <<std::endl;
}

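Both print 2 1. What is going on? I guess the linker resolves the external symbol int a to the value 0. But how does it decide that resolving the external symbol is finished, rather than getting stuck forever in a recursive search for the variable definitions?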

Answer 1

Here is what the standard says:

Variables with static storage duration are initialized as a consequence of program initiation. Variables with thread storage duration are initialized as a consequence of thread execution. Within each of these phases of initiation, initialization occurs as follows.

[...] Constant initialization is performed if a variable or temporary object with static or thread storage duration is initialized by a constant initializer for the entity. If constant initialization is not performed, a variable with static storage duration (6.7.1) or thread storage duration (6.7.2) is zero-initialized (11.6). Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization. All static initialization strongly happens before (4.7.1) any dynamic initialization. [ Note: The dynamic initialization of non-local variables is described in 6.6.3; that of local static variables is described in 9.7. —end note ]

An implementation is permitted to perform the initialization of a variable with static or thread storage duration as a static initialization even if such initialization is not required to be done statically, provided that

the dynamic version of the initialization does not change the value of any other object of static or thread storage duration prior to its initialization, and

the static version of the initialization produces the same value in the initialized variable as would be produced by the dynamic initialization if all variables not required to be initialized statically were initialized dynamically.

[ Note: As a consequence, if the initialization of an object obj1 refers to an object obj2 of namespace scope potentially requiring dynamic initialization and defined later in the same translation unit, it is unspecified whether the value of obj2 used will be the value of the fully initialized obj2 (because obj2 was statically initialized) or will be the value of obj2 merely zero-initialized. For example,
inline double fd() { return 1.0; }
extern double d1;
double d2 = d1;    // unspecified:
                   // may be statically initialized to 0.0 or
                   // dynamically initialized to 0.0 if d1 is
                   // dynamically initialized, or 1.0 otherwise
double d1 = fd();  // may be initialized statically or dynamically to 1.0
—end note ]

[...]

If [some conditions] V is defined before W within a single translation unit, the [dynamic] initialization of V is sequenced before the initialization of W.

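Conceptually, static initialization is performed at translation time: the compiler emits a symbol whose value is the already-initialized value. In some cases this will be 0; in others it will be the result of evaluating a constant-expression initializer and/or calling a constexpr constructor for the variable. If any dynamic initialization is needed, because the variable's actual initializer does not satisfy the conditions for constant initialization, the compiler emits a piece of code that initializes the variables of that translation unit in order of definition. The linker takes all of these pieces of dynamic-initialization code and combines them in some order (possibly interleaved).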

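There is no infinite recursion, because the dynamic initialization of a does not trigger the dynamic initialization of b; it simply uses whatever value b has at that moment, either because b has already been dynamically initialized or because it still holds its statically initialized value. The same holds the other way around. If b is dynamically initialized before a (which you cannot rely on, since the two variables are defined in different translation units), then a has the value 0 when b is dynamically initialized, so b becomes 1; when a is then dynamically initialized, it becomes 2, so the result you see is 2 1. But if a is dynamically initialized before b, you will see 1 2.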
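The two orderings described above can be sketched as a small model. This is only an illustration, not what any particular compiler emits: the function name run_dynamic_init and the b_first flag are hypothetical, and the locals a and b stand in for the two globals after zero-initialization.

```cpp
#include <utility>

// Hypothetical model: a and b start out zero-initialized, then the two
// dynamic initializers run in one of the two possible orders.
// Returns the final values as the pair (a, b).
std::pair<int, int> run_dynamic_init(bool b_first) {
    int a = 0;  // value after static (zero) initialization
    int b = 0;

    if (b_first) {
        b = a + 1;  // reads a == 0, so b becomes 1
        a = b + 1;  // reads b == 1, so a becomes 2
    } else {
        a = b + 1;  // reads b == 0, so a becomes 1
        b = a + 1;  // reads a == 1, so b becomes 2
    }
    return std::make_pair(a, b);
}
```

Each initializer is a plain read followed by a write; neither one "calls" the other, which is why nothing can recurse.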

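When there is only one translation unit, the dynamic initialization of b must happen before that of a, because dynamic initialization within a single translation unit occurs in order of definition (not declaration). This explains the 2 1 result you see. However, 2 1 is still not guaranteed, because the rule quoted above allows a dynamic initialization to be performed statically. The compiler may choose to give a the value 2 statically, because that is the value it would have if it were dynamically initialized. If the compiler makes the initialization of a fully static but does not do the same for b, then the dynamic initialization of b gives it the value 3.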

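What if there are two different translation units? The wording of the standard is unclear here, but my interpretation is that an implementation may fully statically initialize either or both of a and b to any value the variable could have under any valid order of dynamic initialization! If only a is fully statically initialized, it may be statically initialized to 1 or 2, causing b to become 2 or 3 respectively during dynamic initialization. Likewise, if only b is fully statically initialized, it may be statically initialized to 1 or 2, causing a to become 2 or 3 respectively. So: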

For the first program, the possible results are 1 2, 2 1, 2 3, or 3 2.
For the second program, the possible results are 2 1 and 2 3.
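The lists above can be reproduced by brute force. The sketch below encodes this answer's interpretation of the standard, which is itself an assumption: for every permitted order of dynamic initialization, it records the purely dynamic result, the result when only a is promoted to static initialization, and the result when only b is promoted. The names Result, dynamic_only, and possible_results are hypothetical.

```cpp
#include <set>
#include <utility>
#include <vector>

using Result = std::pair<int, int>;  // (a, b)

// Purely dynamic initialization, starting from zero-initialized a and b.
Result dynamic_only(bool b_first) {
    int a = 0, b = 0;
    if (b_first) { b = a + 1; a = b + 1; }
    else         { a = b + 1; b = a + 1; }
    return Result(a, b);
}

// Enumerates the outcomes allowed under the interpretation described above.
std::set<Result> possible_results(bool single_translation_unit) {
    std::vector<bool> orders;
    orders.push_back(true);                   // b before a (definition order)
    if (!single_translation_unit) {
        orders.push_back(false);              // a before b is also possible
    }

    std::set<Result> out;
    for (bool b_first : orders) {
        const Result r = dynamic_only(b_first);
        out.insert(r);                                // nothing promoted
        out.insert(Result(r.first, r.first + 1));     // only a static, b = a + 1 at run time
        out.insert(Result(r.second + 1, r.second));   // only b static, a = b + 1 at run time
    }
    return out;
}
```

Calling possible_results(false) models the two-file program and possible_results(true) the single-file one.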

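I think that in practice, a compiler that gave either variable the value 3 would make some users very angry and would probably be made to stop doing so. Nevertheless, the theoretical possibility remains.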

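One way to avoid unpredictable initialization-order problems is to forbid non-constant initializers for non-local static variables. In that case no dynamic initialization can occur, so all initialization of non-local static variables happens in a well-defined order and yields well-defined values, and in practice it is very likely evaluated at compile time.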
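A minimal sketch of that guideline, assuming the values 2 and 3 purely for illustration: declaring the globals constexpr forces every initializer to be a constant expression. It also rules out the cyclic form from the question, since a constexpr variable must be initialized where it is declared and cannot refer to a value that is not yet known.

```cpp
// Every initializer here is a constant expression, so both variables are
// constant-initialized and no dynamic initialization takes place.
constexpr int a = 2;      // value chosen for illustration only
constexpr int b = a + 1;  // a is already a compile-time constant here

// The values can be checked at compile time, before the program ever runs.
static_assert(a == 2, "a is fixed at compile time");
static_assert(b == 3, "b is fixed at compile time");
```

Note that constexpr also makes the variables const; this sketch does not cover globals that must remain mutable.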

Answer 2

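I think you are describing as a single step what is actually several steps. Let's look at what happens, starting with compilation. I will focus on the definition of b; a is handled similarly.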

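Compiling
Roughly speaking, when the compiler sees "int b = a + 1;", it does two things. First, it sets aside enough memory to store an int. That memory location is annotated with something like "Note to linker: here is the memory location called b". Second, the compiler generates annotated instructions similar to the following, which will be executed when global variables are initialized.
1) Read the value stored at <note to linker: insert the address of a here>.
2) Add 1.
3) Write the result to b.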

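Linking
The linker sees the two annotations generated by the compiler. From the first, it can calculate the address of b, which is added to the linker's internal list of resolved symbol names. Once this list is complete (across all translation units), the linker processes the second annotation by placing the address of a at the requested location. Looking up this address requires nothing more than a standard binary search of the linker's list. (No recursion is needed.)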
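The lookup step can be sketched with a heavily simplified model. Everything here is hypothetical: the Relocation struct and the resolve function are invented for illustration, a std::map stands in for the linker's sorted list, and real linkers work on object-file sections rather than C++ pointers.

```cpp
#include <map>
#include <string>
#include <vector>

// A request left by the compiler: "insert the address of <symbol> here".
struct Relocation {
    std::string symbol;  // name the compiler could not resolve by itself
    int** slot;          // place where the symbol's address must be written
};

// Runs after the symbol table is complete. Each reference costs exactly one
// table lookup; the initializer of the symbol is never examined, so a cycle
// between a and b cannot cause recursion.
// Returns false if a symbol has no definition (an undefined reference).
bool resolve(const std::map<std::string, int*>& symbols,
             const std::vector<Relocation>& relocations) {
    for (const Relocation& r : relocations) {
        auto it = symbols.find(r.symbol);
        if (it == symbols.end()) {
            return false;
        }
        *r.slot = it->second;  // patch the address into the requested place
    }
    return true;
}
```

The point of the model is that the linker only deals with names and addresses; the values of a and b play no part until the program runs.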

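Executing
When the program runs, it follows the instructions generated by the compiler, as modified by the linker. First, memory is set aside for all global and static variables. Then that memory is initialized. When b is initialized, the computer reads whatever value is in a's location, adds 1, and writes the result to b's location. Whether a has already been initialized at that point is not necessarily determined. (See also the static initialization order fiasco.)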
