Decrementing stack by 24 when only 8 bytes are needed?

Question

I have this C code:

long fib(long n) {
  if (n < 2) return 1;
  return fib(n-1) + fib(n-2);
}

int main(int argc, char** argv) {
    return 0;
}

I compiled it with gcc -O0 -fno-optimize-sibling-calls -S file.c, which produces this unoptimized assembly:

    .file   "long.c"
    .text
    .globl  fib
    .type   fib, @function
fib:
.LFB5:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    pushq   %rbx
    subq    $24, %rsp
    .cfi_offset 3, -24
    movq    %rdi, -24(%rbp)
    cmpq    $1, -24(%rbp)
    jg  .L2
    movl    $1, %eax
    jmp .L3
.L2:
    movq    -24(%rbp), %rax
    subq    $1, %rax
    movq    %rax, %rdi
    call    fib
    movq    %rax, %rbx
    movq    -24(%rbp), %rax
    subq    $2, %rax
    movq    %rax, %rdi
    call    fib
    addq    %rbx, %rax
.L3:
    addq    $24, %rsp
    popq    %rbx
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE5:
    .size   fib, .-fib
    .globl  main
    .type   main, @function
main:
.LFB6:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    %edi, -4(%rbp)
    movq    %rsi, -16(%rbp)
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE6:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
    .section    .note.GNU-stack,"",@progbits

My question is:

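Why do we decrement the stack pointer by 24 with subq $24, %rsp? As far as I can see, after the initial two pushes we store only one element on the stack: the first argument n, passed in %rdi. So why don't we decrement the stack pointer by just 8 and then move n to -8(%rbp)? That is: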

subq    $8, %rsp
movq    %rdi, -8(%rbp)

Answer 1

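GCC does not fully optimize at -O0, not even its stack usage. (This can aid debugging by making some uses of the stack more transparent to humans. For example, objects a, b, and c may share a single stack location at -O3 if their active lifetimes (defined by their uses in the program, not by the lifetime model in the C standard) do not overlap, but may each get a separately reserved place on the stack at -O0, which makes it easier for a person to see where a, b, and c are used in the assembly code. The wasted 16 bytes may be a side effect of this: that space may be reserved for some purpose that this small function happens not to use, such as space to save certain registers if needed.)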

Changing optimization to -O3 results in GCC subtracting only eight from the stack pointer.

Tags: c, assembly, gcc, callstack