Variable shadowing in C - why does the compiler get confused?

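I have some code that parses JSON. The struct holds a key, a value, and a pointer to the next struct. Because of nesting, the val pointer sometimes points to another jss struct.

The code below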

struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};

void my_f() {
    ...
    struct jss *js = (struct jss *)malloc(sizeof(struct jss));
    ...
    while(js) {
        struct jss *js1 = (struct jss *)js->val;
        ...
    }
}

compiles and runs fine, and produces this assembly:

struct jss *js = (struct jss *)malloc(sizeof(struct jss));
 4ea:   bf 20 00 00 00          mov    $0x20,%edi
 4ef:   e8 00 00 00 00          callq  4f4 <Init+0x407>
 4f4:   48 89 45 e8             mov    %rax,-0x18(%rbp)
 
    ...
    
        char *t, *f, *h;
        struct jss *js1 = ((struct jss *)(js->val));
 522:   48 8b 45 e8             mov    -0x18(%rbp),%rax
 526:   48 8b 40 10             mov    0x10(%rax),%rax
 52a:   48 89 45 b0             mov    %rax,-0x50(%rbp)

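Here rbp-0x18, which holds the address of the js struct, is loaded into rax. The next instruction loads the pointer stored at offset 0x10 from rax, which is js->val, and the result is stored at rbp-0x50, the slot that holds js1. So far, so good!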
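The 0x10 offset and the 0x20 passed to malloc can be cross-checked against the struct layout. This is only a sketch: the helper names jss_val_offset and jss_size are hypothetical, and the concrete values in the comments assume a typical x86-64 ABI with 8-byte pointers.

```c
#include <stddef.h>
#include <stdint.h>

struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};

/* Hypothetical helper: byte offset of val inside struct jss.
 * On a typical x86-64 ABI this is 0x10 (type + padding, then key). */
static size_t jss_val_offset(void)
{
    return offsetof(struct jss, val);
}

/* Hypothetical helper: total size of struct jss.
 * On a typical x86-64 ABI this is 0x20, the argument passed to malloc. */
static size_t jss_size(void)
{
    return sizeof(struct jss);
}
```

With that layout, `mov 0x10(%rax),%rax` reads the val member of whatever struct rax points to.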

But if I change the code to this (js1 replaced by js):

struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};

void my_f() {
    ...
    struct jss *js = (struct jss *)malloc(sizeof(struct jss));
    ...
    while(js) {
        char *t, *f, *h;
        struct jss *js = (struct jss *)js->val;
        ...
    }
}

I get this assembly:

struct jss *js = (struct jss *)malloc(sizeof(struct jss));
     4ea:   bf 20 00 00 00          mov    $0x20,%edi
     4ef:   e8 00 00 00 00          callq  4f4 <Init+0x407>
     4f4:   48 89 45 e8             mov    %rax,-0x18(%rbp)

    ...

            char *t, *f, *h;
            struct jss *js = ((struct jss *)(js->val));
     522:   48 8b 45 c8             mov    -0x38(%rbp),%rax
     526:   48 8b 40 10             mov    0x10(%rax),%rax
     52a:   48 89 45 c8             mov    %rax,-0x38(%rbp)

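This compiles fine but segfaults: instead of loading the address of the js struct (rbp-0x18) into rax, the code loads from rbp-0x38, the slot of the new variable I just declared. So it is no surprise that it segfaults.

My question: is the second version illegal? I am aware of variable shadowing, and it is indeed what I intended. Why does the compiler get confused (I use gcc)?

Consider this line of your code: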

struct jss *js = (struct jss *)js->val;
//          ^                  ^ 
//          |                  |
//          this js  and this js are the same

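You declare js and then dereference js. The second js is the very variable being declared, which of course has not been initialized yet, hence the segfault.

If you have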

struct jss *js1 = (struct jss *)js->val;

then js refers to the js declared in the outer scope, which is what you want.

Your shadowing case is exactly the same as this simpler one:

int foo = 3;
...
{
   int foo = foo;
   ... // you expect foo to be three here, but actually
       // you're just initializing the new foo with its own uninitialized value
}

By the way, clang issues a very explicit warning in this situation, but gcc does not.

According to the C Standard (6.2.1 Scopes of identifiers)
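If reusing the name js inside the block really is the goal, one possible workaround is to copy the outer pointer into a differently named variable first, and only then declare the shadowing js. The sketch below is not taken from the question: the function nested_key and the variable outer are hypothetical names, and it assumes val either is NULL or actually points to a struct jss.

```c
#include <stddef.h>
#include <stdint.h>

struct jss {
    uint8_t type;
    char *key;
    char *val;
    struct jss *next;
};

/* Hypothetical helper: returns the key of the node nested in js->val,
 * or NULL if js is NULL or has no nested node. */
static const char *nested_key(struct jss *js)
{
    if (js) {
        /* The inner js is not declared yet, so this js is the parameter. */
        struct jss *outer = js;
        /* From the end of this declarator on, js names the inner variable.
         * The initializer only uses outer, so nothing uninitialized is read. */
        struct jss *js = (struct jss *)outer->val;
        return js ? js->key : NULL;
    }
    return NULL;
}
```

Using a distinct name such as js1, as in the first version of the question, remains the simpler option.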

7 Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. Each enumeration constant has scope that begins just after the appearance of its defining enumerator in an enumerator list. Any other identifier has scope that begins just after the completion of its declarator.

So in this while statement

while(js) {
    char *t, *f, *h;
    struct jss *js = (struct jss *)js->val;
    ...
}

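the identifier js is declared with an initializer that refers to itself. That is, the initializer expression uses the indeterminate value of the js being declared, which hides the object with the same name declared in the outer scope before the while statement.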
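The scope rule can be observed without reading any indeterminate value, because sizeof does not evaluate its operand here. In this sketch (the function name size_seen_in_initializer is hypothetical), the outer x is a 100-byte array and the inner x is an unsigned char, so the result shows which x the initializer sees.

```c
#include <stddef.h>

/* Hypothetical helper: reports the size of x as seen from inside
 * the initializer of the inner declaration of x. */
static size_t size_seen_in_initializer(void)
{
    char x[100];            /* outer x: sizeof would be 100 */
    (void)x;                /* silence unused-variable warnings */
    {
        /* The inner x is already in scope inside its own initializer,
         * so sizeof x is sizeof(unsigned char), not sizeof(char[100]). */
        unsigned char x = (unsigned char)sizeof x;
        return x;
    }
}
```

The function returns 1 rather than 100, which matches the quoted rule that the scope of an identifier begins just after the completion of its declarator.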