How do compilers store hundreds of variables in only a few registers?

Suppose you have a virtual machine with only 4 registers: A, B, C, and D. How does a compiler store so many variables in such limited space?

Are there multiple ways to do this, or is there one reliable way? What is the fancy technical term for it, and is it considered a hard problem?

Thanks

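Anything that doesn't fit in registers (which is most things) is stored in memory and moved into a register only when it needs to be operated on. You may want to read the Wikipedia article on register allocation, which is the name for what you're asking about.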

I suggest you read Programming Language Pragmatics or the Dragon Book, in particular the chapters on register allocation.

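In short, there are many ways to handle this. Usually the compiler builds an intermediate representation, which can be an abstract machine with an infinite number of registers or an SSA form. When code is generated for a specific target hardware/OS, these abstract registers (i.e. your original variables) are assigned to actual registers or to stack locations, based on criteria such as how often each abstract register is used or how long it lives.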
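To make the "lifetime" criterion a bit more concrete, here is a minimal Python sketch that computes, for each abstract register, the first and last instruction index at which it appears. The `(destination, sources)` instruction format, the `v1`..`v4` names, and the `live_intervals` function are made up for illustration, and it assumes straight-line code with no branches; real compilers run a proper liveness analysis over the control-flow graph.

```python
def live_intervals(instructions):
    """Map each virtual register to (first_index, last_index) of its appearances.

    `instructions` is a list of (destination, sources) tuples; this format is
    hypothetical and only works for straight-line code without branches.
    """
    intervals = {}
    for index, (dest, sources) in enumerate(instructions):
        for name in (*sources, dest):
            if name is None:  # instruction has no destination
                continue
            start, _ = intervals.get(name, (index, index))
            intervals[name] = (start, index)
    return intervals


# Hypothetical three-address code using abstract registers v1..v4.
program = [
    ("v1", ()),            # v1 = constant
    ("v2", ()),            # v2 = constant
    ("v3", ("v1", "v2")),  # v3 = v1 + v2
    ("v4", ("v3",)),       # v4 = -v3
    (None, ("v4",)),       # print v4
]

for name, interval in live_intervals(program).items():
    print(name, interval)
```

Here `v1` and `v2` are no longer needed after instruction 2, so the registers holding them could be reused for `v3` and `v4` later on.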

Depending on the chosen intermediate representation, there are different approaches (see the examples here or here). The problem can be difficult if you are striving for an optimal solution (i.e. keeping as many variables as possible in actual registers for as long as possible, without spilling them onto the stack), but there are simpler approaches such as "linear scan register allocation" for when time is critical, e.g. in just-in-time compilation.
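For a feel of how linear scan works, here is a rough Python sketch. It assumes each variable has already been reduced to a single `(start, end)` live interval, and the `linear_scan` function, the `R0`/`R1` register names, and the sample intervals are invented for the example. It walks the intervals in order of start point, frees registers whose intervals have ended, and when nothing is free it spills whichever interval ends last.

```python
def linear_scan(intervals, num_registers):
    """Assign registers to live intervals; returns (assignment, spilled).

    intervals: dict mapping a variable name to (start, end).
    """
    free = [f"R{i}" for i in range(num_registers)]
    active = []      # (end, name) pairs that currently hold a register
    assignment = {}  # name -> register
    spilled = []     # names that live on the stack instead

    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        for old in sorted(active):
            if old[0] >= start:
                break
            active.remove(old)
            free.append(assignment[old[1]])

        if free:
            assignment[name] = free.pop(0)
            active.append((end, name))
        elif active and max(active)[0] > end:
            # Spill the active interval that ends last and take its register.
            last = max(active)
            active.remove(last)
            assignment[name] = assignment.pop(last[1])
            active.append((end, name))
            spilled.append(last[1])
        else:
            # The new interval itself ends last, so it is the one spilled.
            spilled.append(name)

    return assignment, spilled


intervals = {"a": (0, 5), "b": (1, 3), "c": (2, 8), "d": (4, 6)}
assignment, spilled = linear_scan(intervals, num_registers=2)
print(assignment)  # {'a': 'R0', 'b': 'R1', 'd': 'R1'}
print(spilled)     # ['c']
```

With two registers, `c` is spilled because it lives longest, and `d` reuses the register that `b` gave up. The result is not guaranteed to be optimal, which is the trade-off for doing a single pass.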

If you want to dig into some code, you could take a look at the LLVM infrastructure and their register allocation, and this introduction.

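This is the topic of register allocation. Basically, the compiler computes, for each variable, which other variables are in use at the same time. It then builds an interference graph, with one node for each variable used in the program and an edge between every pair of variables that are in use simultaneously. This then becomes a graph coloring problem, where the colors correspond to the registers available on the machine.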
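As a small illustration of the interference graph, the Python sketch below assumes the compiler has already worked out which variables are live at each program point. The `live_sets` data and the `build_interference_graph` function are hypothetical. Any two variables that appear in the same live set get an edge between them.

```python
from itertools import combinations


def build_interference_graph(live_sets):
    """Return a dict mapping each variable to the set of variables it interferes with.

    live_sets: iterable of sets, one per program point, holding the variables
    that are live at that point.
    """
    graph = {}
    for live in live_sets:
        for var in live:
            graph.setdefault(var, set())  # isolated variables still get a node
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical liveness information for five program points.
live_sets = [{"a"}, {"a", "b"}, {"a", "b", "c"}, {"c", "d"}, {"d"}]
graph = build_interference_graph(live_sets)
for var in sorted(graph):
    print(var, sorted(graph[var]))
# a ['b', 'c']
# b ['a', 'c']
# c ['a', 'b', 'd']
# d ['c']
```

Here `a`, `b`, and `c` all interfere with one another, so they need three different registers, while `d` only conflicts with `c`.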

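As you may know, graph coloring is an NP-complete problem, so compilers implement a simple but very effective heuristic. Basically, they find the highest-degree node in the graph that has fewer than k edges, where k is the number of registers on the machine. That node and all its edges are removed, and the remaining graph is colored recursively. Once the rest is colored, the node is put back, and since it has fewer than k neighbors there is always a color left for it. If no such node exists, the node with the highest degree is spilled, meaning it is stored on the stack, and the coloring procedure is retried with that node removed.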
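The heuristic can be sketched in Python roughly as follows. This is an iterative version of the recursive description, with a stack standing in for the recursion; the `color_graph` function, the sample graph, and the use of integers `0..k-1` as "registers" are assumptions made for the example. Production allocators add many refinements such as coalescing and spill-cost estimates.

```python
def color_graph(graph, k):
    """Color an interference graph with k colors; returns (coloring, spilled).

    graph: dict mapping each variable to the set of variables it interferes with.
    """
    work = {v: set(ns) for v, ns in graph.items()}  # mutable copy
    stack = []    # removed nodes that are guaranteed to get a color
    spilled = []  # nodes sent to the stack frame instead of a register

    while work:
        candidates = [v for v in work if len(work[v]) < k]
        if candidates:
            # Highest-degree node among those with fewer than k edges.
            node = max(candidates, key=lambda v: (len(work[v]), v))
            stack.append(node)
        else:
            # Every node has at least k edges: spill the highest-degree one.
            node = max(work, key=lambda v: (len(work[v]), v))
            spilled.append(node)
        for neighbour in work.pop(node):
            work[neighbour].discard(node)

    # Put nodes back in reverse order; each has fewer than k colored neighbours.
    coloring = {}
    for node in reversed(stack):
        used = {coloring[n] for n in graph[node] if n in coloring}
        coloring[node] = next(c for c in range(k) if c not in used)
    return coloring, spilled


graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(color_graph(graph, 2))  # ({'a': 0, 'b': 1, 'd': 0}, ['c'])
print(color_graph(graph, 3))  # ({'a': 0, 'd': 0, 'c': 1, 'b': 2}, [])
```

With only two registers, `c` has to be spilled because `a`, `b`, and `c` all interfere with each other. With three registers everything fits.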

compiler-construction

vm-implementation