Why do local variables have indeterminate values in C if they are not initialized?

Question

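In C on Linux, when a function is called, the prologue of the generated assembly sets up a stack frame, and local variables are addressed relative to the base pointer. My question is: when we print a variable without initializing it, what makes the variable hold an indeterminate value? My theory is that when we use the variable, the OS brings in the page that corresponds to the local variable's address, and that address within the page may already contain some value, which becomes the value of the local variable. Is that right?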

Answer 1

Let's look at the assembly generated for a simple program:

#include <stdio.h>

int main() {
    unsigned int i;
    unsigned int j = 1;
    printf("%u\n", j);
    printf("%u\n", i);
}

The assembly produced by GCC 11.1 at its default optimization level is:

    .file   "char.c"
    .text
    .section    .rodata
.LC0:
    .string "%u\n"
    .text
    .globl  main
    .type   main, @function
/* Everything up to here is metadata and assembler directives. The part of interest is below. */

main:
.LFB0:
    .cfi_startproc
    endbr64
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movl    $1, -8(%rbp)
    movl    -8(%rbp), %eax /* See, it wrote 1 into -8(%rbp), which
represents the variable j, but didn't assign anything to
 -4(%rbp), which represents the variable i */
    movl    %eax, %esi
    leaq    .LC0(%rip), %rax
    movq    %rax, %rdi
    movl    $0, %eax
    call    printf@PLT
    movl    -4(%rbp), %eax /* Now we load -4(%rbp), which is i, into
 %eax for printing. Whatever happens to be at -4(%rbp) gets printed,
 so the value is indeterminate */
    movl    %eax, %esi
    leaq    .LC0(%rip), %rax
    movq    %rax, %rdi
    movl    $0, %eax
    call    printf@PLT
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 11.1.0-3ubuntu1) 11.1.0"
    .section    .note.GNU-stack,"",@progbits
    .section    .note.gnu.property,"a"
    .align 8
    .long   1f - 0f
    .long   4f - 1f
    .long   5
0:
    .string "GNU"
1:
    .align 8
    .long   0xc0000002
    .long   3f - 2f
2:
    .long   0x3
3:
    .align 8
4:

Read the comments in the assembly listing for the explanation.

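Apparently, in some cases the compiler may not even bother to load the uninitialized variable into a register (not in this case; it probably depends on the compiler, the optimization level, and the situation) and instead just uses whatever is already in the register. I once saw someone say this; I haven't checked the ISO standard or verified it myself. How do you even begin to find something like this in the standard? It's huge.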

Answer 2

Consider a compiler compiling a program that correctly initializes an object:

int x = 3;
printf("%d\n", x);
int y = 4+x*7;
printf("%d\n", y);

This might result in the following (pseudo-)assembly code:

Store 3 in X.                   // "X" refers to the stack location assigned for x.
Load address of "%d\n" into R0. // R0 is the register used for passing the first argument.
Load from X into R1.            // R1 is the register for the second argument.
Call printf.
Load 4 into R1.                 // Start the 4 of 4+x*7.
Load from X into R2.            // Get x to calculate with it.
Multiply R2 by 7.               // Make x*7.
Add R2 to R1.                   // Finish 4+x*7.
Load address of "%d\n" into R0.
Call printf.

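This is a working program. Now suppose we do not initialize x and instead write int x;. Since x is not initialized, the rules say it has no determined value. This means the compiler is allowed to omit all of the instructions that fetch the value of x. So let's take the working assembly code and remove every instruction that fetches the value of x: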

Load address of "%d\n" into R0. // R0 is the register used for passing the first argument.
Call printf.
Load 4 into R1.                 // Start the 4 of 4+x*7.
Multiply R2 by 7.               // Make x*7.
Add R2 to R1.                   // Finish 4+x*7.
Load address of "%d\n" into R0.
Call printf.

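In this program, the first printf prints whatever is in R1, because the value of x was never loaded into R1. The calculation of x*7 uses whatever is in R2, because the value of x was never loaded into R2. So this program might print "37" in the first printf because R1 happened to contain 37, and yet print "4" in the second printf because R2 happened to contain 0. The output of this program "looks like" x had the value 37 at one moment and the value 0 at another. The program behaves as if x does not have any fixed value.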

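This is a very simplified example. In practice, when a compiler removes code during optimization, it removes more. For example, if it knows x is not initialized, it may remove not only the load of x but also the multiplication by 7. Still, the example demonstrates the principle: when a value is uninitialized, the compiler can fundamentally change the generated code.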

Tags: c, memory-management, linux-kernel