无程序在ASM中实现递归

Question

我正在尝试使用没有过程的类似 ASM 的简化语言来实现函数和递归。只有简单的 jumpz、jump、push、pop、add、mul 类型命令。

命令如下：
（所有变量和文字都是整数）

set（设置一个已经存在的变量的值或声明并初始化一个新变量）例如（设置 x 3）
push（将值压入堆栈。可以是变量或整数）例如（按 3）或（按 x）
pop（将堆栈弹出到一个变量中）例如（弹出 x）
add（将第二个参数添加到第一个参数）例如(加 x 1) 或 (加 x y)
mul（与加法相同，但用于乘法）
跳转（跳转到特定的代码行）例如(jump 3) 会跳转到第 3 行或者 (jump x) 会跳转到 # 等于 x
jumpz（如果第二个参数为零则跳转到行号）例如(jumpz 3 x) 或 (jumpz z x)

变量'IP'是程序计数器，等于当前正在执行的代码行的行号。

在这种语言中，函数是程序底部的代码块，通过从堆栈中弹出一个值并跳转到该值来终止。（使用堆栈作为调用堆栈）然后只需将指令指针压入堆栈，然后跳转到函数的开头，就可以在程序的任何其他地方调用函数。

这适用于非递归函数。

如何修改它以处理递归？

我读到用堆栈实现递归是将参数和局部变量压入堆栈（在这种较低级别的情况下，我认为还有指令指针）

我不能做像 x = f(n) 这样的事情。为此，我需要一些变量 y（也在 f 的主体中使用），将 y 设置为 n，调用 f 将其 "return value" 分配给 y，然后将控制权跳回到调用 f 的位置从，然后我们设置 x 等于 y。

(一个对数字求平方的函数，其定义从第 36 行开始)

1 - set y 3
2 - set returnLine IP
3 - add returnLine 2
4 - push returnLine
5 - jump 36
6 - set x y
...
36 - mul y 2
37 - pop returnLine
38 - jump returnLine

这似乎不适合递归。参数和中间值需要进入堆栈，我认为同一地址的堆栈上的多个实例将由递归调用产生，这很好。

Answer 1

下一个代码在 "John Smith Assembly" 中递归地对数字 "base" 进行 "exponent" 次方的运算：

1 - set base 2            ;RAISE 2 TO ...
2 - set exponent 4        ;... EXPONENT 4 (2^4=16).
3 - set result 1          ;MUST BE 1 IN ORDER TO MULTIPLY.
4 - set returnLine IP     ;IP = 4.
5 - add returnLine 4      ;RETURNLINE = 4+4.
6 - push returnLine       ;PUSH 8.
7 - jump 36               ;CALL FUNCTION.
.
.
.
;POWER FUNCTION.
36 - jumpz 43 exponent    ;FINISH IF EXPONENT IS ZERO.

37 - mul result base      ;RESULT = ( RESULT * BASE ).
38 - add exponent -1      ;RECURSIVE CONTROL VARIABLE.
39 - set returnLine IP    ;IP = 39.
40 - add returnLine 4     ;RETURN LINE = 39+4.
41 - push returnLine      ;PUSH 43.
42 - jump 36              ;RECURSIVE CALL.

43 - pop returnLine
44 - jump returnLine
;POWER END.

为了测试它，让我们运行手动测试它：

 LINE | BASE EXPONENT RESULT RETURNLINE STACK
------|---------------------------------------
   1  |   2
   2  |         4
   3  |                  1
   4  |                           4
   5  |                           8
   6  |                                   8
   7  |
  36  |
  37  |                  2
  38  |         3
  39  |                          39
  40  |                          43
  41  |                                  43(1)
  42  |
  36  |
  37  |                  4
  38  |         2
  39  |                          39
  40  |                          43
  41  |                                  43(2)
  42  |
  36  |
  37  |                  8
  38  |         1
  39  |                         39
  40  |                         43
  41  |                                  43(3)
  42  |
  36  |
  37  |                 16
  38  |         0
  39  |                         39
  40  |                         43
  41  |                                  43(4)
  42  |
  36  |
  43  |                         43(4)
  44  |
  43  |                         43(3)
  44  |
  43  |                         43(2)
  44  |
  43  |                         43(1)
  44  |
  43  |                          8
  44  |
   8  |

编辑：函数的参数现在在堆栈上（不是运行手动）：

1 - set base 2            ;RAISE 2 TO ...
2 - set exponent 4        ;... EXPONENT 4 (2^4=16).
3 - set result 1          ;MUST BE 1 IN ORDER TO MULTIPLY.
4 - set returnLine IP     ;IP = 4.
5 - add returnLine 7      ;RETURNLINE = 4+7.
6 - push returnLine       ;PUSH 11.
7 - push base             ;FIRST PARAMETER.
8 - push result           ;SECOND PARAMETER.
9 - push exponent         ;THIRD PARAMETER.
10 - jump 36              ;FUNCTION CALL.
...
;POWER FUNCTION.
36 - pop exponent         ;THIRD PARAMETER.
37 - pop result           ;SECOND PARAMETER.
38 - pop base             ;FIRST PARAMETER.
39 - jumpz 49 exponent    ;FINISH IF EXPONENT IS ZERO.

40 - mul result base      ;RESULT = ( RESULT * BASE ).
41 - add exponent -1      ;RECURSIVE CONTROL VARIABLE.
42 - set returnLine IP    ;IP = 42.
43 - add returnLine 7     ;RETURN LINE = 42+7.
44 - push returnLine      ;PUSH 49.
45 - push base
46 - push result
47 - push exponent
48 - jump 36              ;RECURSIVE CALL.


49 - pop returnLine
50 - jump returnLine
;POWER END.

Answer 2

您的 asm 确实提供了足够的工具来实现通常的过程调用/return 序列。您可以压入一个 return 地址并作为 call 跳转，弹出一个 return 地址（进入临时位置）并作为 ret 间接跳转到它。我们可以只制作 call 和 ret 宏。（除了在宏中生成正确的 return 地址很棘手；我们可能需要一个标签 (push ret_addr)，或者类似 set tmp, IP / add tmp, 4 / push tmp / jump target_function)。简而言之，这是可能的，我们应该将其包裹在一些语法糖中，这样我们在研究递归时就不会陷入困境。

使用正确的语法糖，您可以在汇编中实现 Fibonacci(n)，实际上 assemble 适用于 x86 和您的玩具机。

您正在考虑修改静态（全局）变量的函数。递归需要局部变量，因此对函数的每个嵌套调用都有其自己的局部变量副本。您的机器没有寄存器，而是（显然是无限的）命名静态变量（如 x 和 y）。如果您想像 MIPS 或 x86 一样对其进行编程，并复制现有的调用约定，只需使用一些命名变量，例如 eax、ebx、...或 r0 .. r31寄存器架构使用寄存器的方式。

然后你像在普通调用约定中一样实现递归，其中调用者或被调用者使用 push / pop 到 save/restore 堆栈上的寄存器所以它可以重复使用。函数 return 值进入寄存器。函数参数应该放在寄存器中。一个丑陋的替代方案是将它们推送到 after return 地址（创建一个 caller-cleans-the-args-from-the-stack 调用约定），因为你不没有堆栈相关的寻址模式来像 x86 那样访问它们（在堆栈的 return 地址之上）。或者您可以像大多数 RISC call 指令一样在 link register 中传递 return 地址（对于分支与 link，通常称为 bl 或类似指令），而不是像 x86 的 call 那样推动它。（所以非叶被调用者必须在进行另一个调用之前将传入的 lr 自己推入堆栈）

一个（愚蠢而缓慢的）天真实现的递归 Fibonacci 可能会做类似的事情：

int Fib(int n) {
    if(n<=1) return n;          // Fib(0) = 0;  Fib(1) = 1
    return Fib(n-1) + Fib(n-2);
}

## valid implementation in your toy language *and* x86 (AMD64 System V calling convention)

### Convenience macros for the toy asm implementation
# pretend that the call implementation has some way to make each return_address label unique so you can use it multiple times.
# i.e. just pretend that pushing a return address and jumping is a solved problem, however you want to solve it.
%define call(target)   push return_address; jump target; return_address:
%define ret            pop rettmp; jump rettmp    # dedicate a whole variable just for ret, because we can
# As the first thing in your program,  set eax, 0  / set ebx, 0 / ...

global Fib
Fib:
    # input: n in edi.
    # output: return value in eax
      # if (n<=1) return n;  // the asm implementation of this part isn't interesting or relevant.  We know it's possible with some adds and jumps, so just pseudocode / handwave it:
    ... set eax, edi and ret  if edi <= 1 ... # (not shown because not interesting)
    add     edi, -1
    push    edi        # save n-1 for use after the recursive call
    call    Fib        # eax = Fib(n-1)
    pop     edi        # restore edi to *our* n-1
    push    eax        # save the Fib(n-1) result across the call
    add     edi, -1
    call    Fib        # eax = Fib(n-2)
    pop     edi        # use edi as a scratch register to hold Fib(n-1) that we saved earlier
    add     eax, edi   # eax = return value = Fib(n-1) + Fib(n-2)
    ret

在对 Fib(n-1) 的递归调用期间（以 edi 中的 n-1 作为第一个参数），n-1 arg 也保存在堆栈中，成为后来恢复了。 因此每个函数的堆栈帧包含需要在递归调用中存活的状态，以及一个 return 地址。 这正是具有堆栈的机器上递归的全部内容。

Jose 的例子也没有证明这一点，IMO，因为没有状态需要在 pow 的调用中存活下来。所以它最终只是推送一个 return 地址和 args，然后弹出 args，只构建一些 return 地址。然后在最后，跟随 return 地址链。它可以扩展为在每个嵌套调用中保存本地状态，实际上并没有说明它。

我的实现与 gcc 为 x86-64 编译相同 C 函数的方式有点不同（使用相同的调用约定，即在 edi 中使用第一个 arg，在 eax 中使用 ret 值）。 -O1 的 gcc6.1 保持简单，实际上进行了两次递归调用，如您所见 on the Godbolt compiler explorer。（-O2 尤其是 -O3 进行了一些激进的转换）。 gcc saves/restores rbx 贯穿整个函数，并在 ebx 中保留 n，因此它在 Fib(n-1) 调用后可用。（并在 ebx 中保留 Fib(n-1) 以在第二次调用中存活）。 System V 调用约定将 rbx 指定为调用保留寄存器，但将 rbi 指定为调用破坏（并用于参数传递）。

显然你可以非递归地更快地实现 Fib(n) much，具有 O(n) 时间复杂度和 O(1) space 复杂度，而不是O(Fib(n)) 时间和 space（堆栈使用）复杂度。这是一个糟糕的例子，但它是微不足道的。

Answer 3

Margaret 的 pastebin 在我的这种语言的解释器中稍微修改为运行：（无限循环问题，可能是由于我的转录错误）

set n 3
push n
set initialCallAddress IP
add initialCallAddress 4
push initialCallAddress
jump fact
set finalValue 0
pop finalValue
print finalValue
jump 100
:fact
set rip 0
pop rip
pop n
push rip
set temp n
add n -1
jumpz end n
push n
set link IP
add link 4
push link
jump fact
pop n
mul temp n
:end
pop rip
push temp
jump rip

彼得的斐波那契计算器的成功转录：

        String[] x = new String[] {
            //n is our input, which term of the sequence we want to calculate
                "set n 5",
            //temp variable for use throughout the program
                "set temp 0",
            //call fib
                "set temp IP",
                "add temp 4",
                "push temp",
                "jump fib",
            //program is finished, prints return value and jumps to end
                "print returnValue",
                "jump end",
            //the fib function, which gets called recursively
                ":fib",
            //if this is the base case, then we assert that f(0) = f(1) = 1 and return from the call
                "jumple base n 1",
                "jump notBase",
                ":base",
                "set returnValue n",
                "pop temp",
                "jump temp",
                ":notBase",
            //we want to calculate f(n-1) and f(n-2)
            //this is where we calculate f(n-1)
                "add n -1",
                "push n",
                "set temp IP",
                "add temp 4",
                "push temp",
                "jump fib",
            //return from the call that calculated f(n-1)
                "pop n",
                "push returnValue",
            //now we calculate f(n-2)
                "add n -1",
                "set temp IP",
                "add temp 4",
                "push temp",
                "jump fib",
            //return from call that calculated f(n-2)
                "pop n",
                "add returnValue n",
            //this is where the fib function ultimately ends and returns to caller
                "pop temp",
                "jump temp",
            //end label
                ":end"
        };

无程序在ASM中实现递归

Implement recursion in ASM without procedures

recursion

assembly

callstack