How does static linking without an archive file work?

Question

I have two files:

main.c

void swap();

int buf[2] = {1, 2}; 

int main() 
{
    swap();
    return 0;
}

swap.c

extern int buf[];

int* bufp0 = &buf[0]; /* .data */
int* bufp1; /* .bss */

void swap()
{
    int temp;
    
    bufp1 = &buf[1];
    temp = *bufp0;
    *bufp0 = *bufp1;
    *bufp1 = temp;
}

Here are two passages from a book:

During this scan, the linker maintains a set E of relocatable object files that 
will be merged to form the executable, a set U of unresolved symbols 
(i.e., symbols referred to, but not yet defined), and a set D of symbols that 
have been defined in previous input files.
Initially, E, U , and D are empty.

For each input file f on the command line, the linker determines if f is an
object file or an archive. If f is an object file, the linker adds f to E, updates
U and D to reflect the symbol definitions and references in f , and proceeds
to the next input file.

If f is an archive, the linker attempts to match the unresolved symbols in U
against the symbols defined by the members of the archive. If some archive
member, m, defines a symbol that resolves a reference in U , then m is added
to E, and the linker updates U and D to reflect the symbol definitions and
references in m. This process iterates over the member object files in the
archive until a fixed point is reached where U and D no longer change. At
this point, any member object files not contained in E are simply discarded
and the linker proceeds to the next input file.

If U is nonempty when the linker finishes scanning the input files on the
command line, it prints an error and terminates. Otherwise, it merges and
relocates the object files in E to build the output executable file.

The general rule for libraries is to place them at the end of the command
line. If the members of the different libraries are independent, in that no member
references a symbol defined by another member, then the libraries can be placed
at the end of the command line in any order.

If, on the other hand, the libraries are not independent, then they must be
ordered so that for each symbol s that is referenced externally by a member of an
archive, at least one definition of s follows a reference to s on the command line.

For example, suppose foo.c calls functions in libx.a and libz.a that call
functions in liby.a. Then libx.a and libz.a must precede liby.a on the command
line:

unix> gcc foo.c libx.a libz.a liby.a

I ran the following command to statically link the two object files (without creating any archive file):

gcc -static -o main.o main.c swap.c

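I expected the command above to fail, because main.c and swap.c each reference a symbol that is defined in the other. To my surprise, it succeeded. I expected it to succeed only if I passed main.c a second time at the end of the command line.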

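How does the linker resolve the references in both files in this case? Does the linker work differently when it statically links multiple object files rather than archives? My guess is that the linker wraps back around to main.c to resolve the reference to buf in swap.c.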

Answer 1

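The gcc command in your case (without -c) produces an executable image. It compiles each .c file on the command line into a .o representation, then invokes the linker (ld) on all of those .o files. The linker resolves the references and produces an executable, which in your case is named main.o, because -o names the output file. You can run it.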

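A static archive library is just a collection of separately compiled .o files. The linker examines the symbols in the archive in order to resolve references. You can precompile your .c files with the -c option to produce .o files and then use those on the command line, or create an archive from them and use the archive instead.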

Answer 2

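Generally, a linker's default behavior is to include everything from every object module file it is given, and to take from libraries only those object modules that define a symbol for which the linker has an outstanding reference at the time it processes the library.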

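So, when the linker processes main.o, it prepares everything in it to go into the output file it is building. That includes remembering (whether in memory or in auxiliary files the linker maintains temporarily) all the symbols that main.o defines and all the symbols for which main.o has unresolved references. When the linker processes swap.o, it adds everything in swap.o to the output file as well. In addition, for any references in main.o that are satisfied by definitions in swap.o, it resolves those references. And for any references in swap.o that are satisfied by definitions in main.o, it resolves those references too.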

As the text you quoted says, for object module files:

“(...) the linker adds f to E, updates U and D to reflect the symbol definitions and references in f, and proceeds to the next input file.”

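That step is effectively the same for every object module the linker adds to the executable, whether the module comes from an object module file or from a library file. The difference is that if the object module is in a file, the linker adds it to the executable unconditionally, whereas if the object module is in a library, the linker adds it to the executable only if it defines a symbol that the linker is currently looking for.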
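To make the difference concrete, here is a toy model in C of the left-to-right scan described in the quoted book text. It is only a sketch for illustration, not how ld is actually implemented: the types and functions (SymSet, Module, Input, LinkState, scan_inputs and so on) are invented for this example, symbols are plain strings, and duplicate definitions are not diagnosed. MAIN_O and SWAP_O are hand-written stand-ins for the symbol tables of the main.c and swap.c in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS    16   /* capacity of each toy symbol set */
#define MAX_MEMBERS 16   /* maximum members per toy archive */

typedef struct { const char *names[MAX_SYMS]; int count; } SymSet;

static int set_has(const SymSet *s, const char *name) {
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0) return 1;
    return 0;
}

static void set_add(SymSet *s, const char *name) {
    if (set_has(s, name)) return;
    assert(s->count < MAX_SYMS);
    s->names[s->count++] = name;
}

static void set_remove(SymSet *s, const char *name) {
    for (int i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0) {
            s->names[i] = s->names[--s->count];   /* order is irrelevant */
            return;
        }
}

/* Hypothetical stand-in for an object module: NULL-terminated symbol lists. */
typedef struct {
    const char *file;
    const char *defs[8];   /* symbols the module defines */
    const char *refs[8];   /* symbols the module references */
} Module;

/* One command-line input: a plain object file (one module) or an archive. */
typedef struct { int is_archive; const Module *modules; int n_modules; } Input;

/* U = referenced but undefined, D = defined so far, n_in_E = size of E. */
typedef struct { SymSet U; SymSet D; int n_in_E; } LinkState;

/* Add a module to E and update U and D from its definitions and references. */
static void add_module(const Module *m, LinkState *st) {
    for (int i = 0; i < 8 && m->defs[i] != NULL; i++) {
        set_add(&st->D, m->defs[i]);
        set_remove(&st->U, m->defs[i]);   /* a pending reference is now satisfied */
    }
    for (int i = 0; i < 8 && m->refs[i] != NULL; i++)
        if (!set_has(&st->D, m->refs[i]))
            set_add(&st->U, m->refs[i]);  /* remember for later inputs */
    st->n_in_E++;
}

/* An archive member is wanted only if it defines something currently in U. */
static int member_needed(const Module *m, const SymSet *U) {
    for (int i = 0; i < 8 && m->defs[i] != NULL; i++)
        if (set_has(U, m->defs[i])) return 1;
    return 0;
}

/* Scan the inputs once, left to right; the final U holds what is unresolved. */
LinkState scan_inputs(const Input *inputs, int n_inputs) {
    LinkState st = {0};
    for (int i = 0; i < n_inputs; i++) {
        const Input *in = &inputs[i];
        if (!in->is_archive) {            /* object file: added unconditionally */
            for (int j = 0; j < in->n_modules; j++)
                add_module(&in->modules[j], &st);
            continue;
        }
        int included[MAX_MEMBERS] = {0};  /* archive: iterate to a fixed point */
        assert(in->n_modules <= MAX_MEMBERS);
        for (int changed = 1; changed; ) {
            changed = 0;
            for (int j = 0; j < in->n_modules; j++)
                if (!included[j] && member_needed(&in->modules[j], &st.U)) {
                    add_module(&in->modules[j], &st);
                    included[j] = changed = 1;
                }
        }
    }
    return st;
}

/* Symbol tables written by hand to mirror main.c and swap.c above. */
const Module MAIN_O = { "main.o", { "buf", "main", NULL }, { "swap", NULL } };
const Module SWAP_O = { "swap.o", { "bufp0", "bufp1", "swap", NULL }, { "buf", NULL } };
```

In this model, passing MAIN_O and SWAP_O as plain object files leaves U empty in either order, because both modules are always added and each one's definitions cancel the other's pending references. No second pass over main.o is needed. If SWAP_O is instead wrapped in an archive that is placed before MAIN_O, U is still empty when the archive is scanned, so the member is skipped and swap remains unresolved at the end.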
