Stack space for a vector whose size is given at runtime? (C code)

Question

Consider this C code:

#include <stdio.h>

int main(){
    int n;
    scanf("%d\n", &n);

    int a[n];
    int i;

    for (i = 0; i<n; i++){
        a[i] = 1;
    }

}

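We have a vector in stack space, but we don't know its size until execution time (until the user gives a value to the variable n). So my question is: when and how is space reserved for that vector in the stack section?

Until now I had understood that stack space was reserved at compile time and heap space at runtime (with functions such as malloc). But the size of this vector can't be known until runtime.

I thought that what could be done is to subtract the value of n from the stack pointer, enlarging the stack of that function so that the vector fits (the subtraction I mention would only be visible in the assembly code).

But I have been running some tests while watching the contents of /proc/[pid]/maps, and the stack space of the process does not change, so what I had in mind (an instruction in the assembly code that subtracts n*sizeof(int) from the top of the stack) does not seem to be happening. I looked at the contents of /proc/[pid]/maps at the very beginning of main and at the very end.

If I compile this code for x86 (gcc -m32 -o test.c), I get the following assembly code (in case you need it):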

.file   "test.c"
    .text
    .section    .rodata
.LC0:
    .string "%d\n"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    leal    4(%esp), %ecx
    .cfi_def_cfa 1, 0
    andl    $-16, %esp
    pushl   -4(%ecx)
    pushl   %ebp
    .cfi_escape 0x10,0x5,0x2,0x75,0
    movl    %esp, %ebp
    pushl   %esi
    pushl   %ebx
    pushl   %ecx
    .cfi_escape 0xf,0x3,0x75,0x74,0x6
    .cfi_escape 0x10,0x6,0x2,0x75,0x7c
    .cfi_escape 0x10,0x3,0x2,0x75,0x78
    subl    , %esp
    call    __x86.get_pc_thunk.ax
    addl    $_GLOBAL_OFFSET_TABLE_, %eax
    movl    %gs:20, %ecx
    movl    %ecx, -28(%ebp)
    xorl    %ecx, %ecx
    movl    %esp, %edx
    movl    %edx, %esi
    subl    , %esp
    leal    -44(%ebp), %edx
    pushl   %edx
    leal    .LC0@GOTOFF(%eax), %edx
    pushl   %edx
    movl    %eax, %ebx
    call    __isoc99_scanf@PLT
    addl    , %esp
    movl    -44(%ebp), %eax
    leal    -1(%eax), %edx
    movl    %edx, -36(%ebp)
    sall    , %eax
    leal    3(%eax), %edx
    movl    , %eax
    subl    , %eax
    addl    %edx, %eax
    movl    , %ebx
    movl    $0, %edx
    divl    %ebx
    imull   , %eax, %eax
    subl    %eax, %esp
    movl    %esp, %eax
    addl    , %eax
    shrl    , %eax
    sall    , %eax
    movl    %eax, -32(%ebp)
    movl    $0, -40(%ebp)
    jmp .L2
.L3:
    movl    -32(%ebp), %eax
    movl    -40(%ebp), %edx
    movl    $1, (%eax,%edx,4)
    addl    $1, -40(%ebp)
.L2:
    movl    -44(%ebp), %eax
    cmpl    %eax, -40(%ebp)
    jl  .L3
    movl    %esi, %esp
    movl    $0, %eax
    movl    -28(%ebp), %ecx
    xorl    %gs:20, %ecx
    je  .L5
    call    __stack_chk_fail_local
.L5:
    leal    -12(%ebp), %esp
    popl    %ecx
    .cfi_restore 1
    .cfi_def_cfa 1, 0
    popl    %ebx
    .cfi_restore 3
    popl    %esi
    .cfi_restore 6
    popl    %ebp
    .cfi_restore 5
    leal    -4(%ecx), %esp
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .section    .text.__x86.get_pc_thunk.ax,"axG",@progbits,__x86.get_pc_thunk.ax,comdat
    .globl  __x86.get_pc_thunk.ax
    .hidden __x86.get_pc_thunk.ax
    .type   __x86.get_pc_thunk.ax, @function
__x86.get_pc_thunk.ax:
.LFB1:
    .cfi_startproc
    movl    (%esp), %eax
    ret
    .cfi_endproc
.LFE1:
    .hidden __stack_chk_fail_local
    .ident  "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
    .section    .note.GNU-stack,"",@progbits

Answer 1

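This is platform-specific, but in general the space is reserved when the program starts, and there is a maximum stack size. On Windows the default maximum is 1 MB according to Microsoft, and you can change it with a linker setting (in the project properties in Visual Studio).

If your program is multithreaded, stack space for the other threads is reserved when they start.

If you try to use more stack space than is available, your program will usually crash, and it may or may not also be a security vulnerability (i.e. a way for people to hack your program); see "stack clash".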

Answer 2

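You can read the comments on the question; thanks to Peter Cordes's help, they solved my problem. Basically, the space the array needs on the stack is reserved at runtime, at the exact point where the array is declared (because n is a known value by then). There will be an instruction in the assembly code that does stackPointer = stackPointer - n * sizeof(int).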

Answer 3

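First of all, your code is severely broken: n isn't set until after it's used to set the size of int vector[n];. Changing n after that doesn't change the array dimension. Variable-length arrays are a C99 feature, and C99 also removes the need to declare variables before any other statements in a block, which makes it possible for you to scanf into n before the int vector[n]; statement reserves space on the stack for an array of that size.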

Until now I had understood that the stack space was reserved at compile time and the heap space at runtime

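The total stack region is reserved at program startup. Depending on the OS, the amount of space reserved for stack growth is chosen by an OS setting, not by metadata in the executable. (For example, on Linux the ulimit -s setting applies to the initial thread's stack, and pthreads chooses how much space to allocate for each additional thread stack.)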

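The layout of a stack frame is fixed at compile time (where local variables are relative to each other), but the actual allocation happens every time the function runs. That's how functions can be recursive and re-entrant! It is also what makes the stack a stack: space is made at the end for the current function and freed right before it returns. (Variable-length arrays and alloca have runtime-variable sizes, so the compiler will usually put them below the other locals.)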

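Only static storage is truly reserved/allocated at compile time. (Global variables and static variables.)

(ISO C doesn't require an actual stack, just LIFO semantics for the lifetimes of automatic-storage variables. Some implementations on some ISAs basically allocate space for stack frames dynamically, as if with malloc, instead of using a stack.)

This rules out statically allocating space for local variables at compile time. In most C implementations they live on the stack, with the space reserved by something like x86-64 sub rsp, 24. Of course, the layout of the locals relative to each other is fixed at compile time, inside that larger allocation, so the compiler doesn't need to generate code that stores pointers to the objects; it just emits instructions that use addressing modes like [rsp + 4].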

So my question is: when and how space is reserved for that vector int the stack section?

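Logically, in the C abstract machine: when the int vector[n] statement is reached, in this execution of the function. By contrast, fixed-size objects exist from the top of the enclosing scope.

So your example is severely broken. You leave n uninitialized until after the VLA has been allocated!! Compile your code with warnings enabled to catch problems like this. The scanf should come before int vector[n]. (Also, don't call plain arrays "vectors"; that looks wrong to people who know C++.)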

But in this case, the C and x86 rules that mention that local variables should be placed in the order of their declaration would not be respected.

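There is no such rule. In ISO C it is undefined behavior even to write vector < &n and compare the addresses of separate objects. (C++ allows that with std::less; C has no equivalent.)

A C compiler can lay out the stack frame however it chooses, for example grouping small objects together to avoid wasting space on padding needed to align larger, more strictly aligned objects.

x86 asm has no variable declarations at all. As the programmer (or the C compiler), you write instructions that move the stack pointer and use memory addressing modes to access the memory you want to access. Usually you do this to implement the high-level concept of a "variable".

For example, let's make a version of the function that takes n as a function argument instead of using scanf.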

#include <stdio.h>
void use_mem(void*);   // compiler can't optimize away calls to this unknown function

void foo(int size) {
    int n = size;      // uninitialized was UB
    int array[n];
    int i;
    i = 5;             // optimizes away, i is kept in a register
    //scanf("%d\n", &n);   // read some different size later???  makes no sense

    for (i = 0; i<n; i++){
        array[i] = 1;
    }
    use_mem(array);    // make the stores not be dead
}

On Godbolt with GCC 10.1 -O2 -Wall, for x86-64 System V:

foo(int):
        push    rbp
        movsx   rax, edi                # sign-extend n
        lea     rax, [15+rax*4]         # round size up
        and     rax, -16                #  to a multiple of 16, to maintain stack alignment
        mov     rbp, rsp                # finish setting up a frame pointer
        sub     rsp, rax                # allocate space for array[]
        mov     r8, rsp                 # keep a pointer to it
        test    edi, edi                # if ( n<=0 ) skip the loop
        jle     .L2
        mov     edi, edi                # zero-extend n
        mov     rax, r8                 # int *p = array
        lea     rdx, [r8+rdi*4]         # endp = &array[(unsigned)n]
.L3:                                    # do{
        mov     DWORD PTR [rax], 1      #   *p = 1
        add     rax, 4                  #   pointer increment
        cmp     rax, rdx
        jne     .L3                     # }while(p != endp)
.L2:
        mov     rdi, r8                 # pass a pointer to the VLA
        call    use_mem(void*)
        leave                           # tear down frame pointer / stack frame
        ret

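Notice that when call use_mem runs, the array[n] space is above the stack pointer, i.e. "allocated".

If use_mem called back into this function, another instance of the VLA, with its own size, would be allocated on the stack.

The leave instruction is just mov rsp, rbp / pop rbp, so it sets the stack pointer to point above the allocated space, deallocating it.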

Tags: c, x86, assembly, variable-length-array, automatic-storage