Why does gcc produce different compiled binaries for programs that use different forms of integer literals?

I would like to know what the difference is between:

int a = 0b00000100;
int a = 0x04;
int a = 4;

when compiled with gcc.

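I seem to get different binaries when compiling what appears to be the same number, just written in a different notation. However, when I run objdump on them, there does not appear to be any difference. Can anyone tell me what is going on?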

Here is my output:

marshall@dont.even.try.to.h4ck.me:[~]: cat testbin.c && echo && cat testbin2.c
#include "stdio.h"
int main () {
  int a = 0b00000100;
  int b = 0x05;
  int c = 6;
  printf("%d - %d - %d\n", a, b, c);
  return (0);
}

#include "stdio.h"
int main () {
  int a = 4;
  int b = 5;
  int c = 6;
  printf("%d - %d - %d\n", a, b, c);
  return (0);
}
marshall@dont.even.try.to.h4ck.me:[~]: gcc testbin.c -o testbin
marshall@dont.even.try.to.h4ck.me:[~]: gcc testbin2.c -o testbin2
marshall@dont.even.try.to.h4ck.me:[~]: md5sum testbin testbin2
fd6aaa31bdf685ea9444e1edc209565e  testbin
3a3fc241bfc2917ee29999b5befecd2a  testbin2
marshall@dont.even.try.to.h4ck.me:[~]: objdump -d testbin > testbin.obj && objdump -d testbin2 > testbin2.obj
marshall@dont.even.try.to.h4ck.me:[~]: diff testbin.obj testbin2.obj
2c2
< testbin:     file format elf64-x86-64
---
> testbin2:     file format elf64-x86-64
marshall@dont.even.try.to.h4ck.me:[~]: gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18)
marshall@dont.even.try.to.h4ck.me:[~]:

Note that the executables are different (they have different hashes), yet objdump -d shows no difference.

I think this has nothing to do with the integer format and everything to do with the file names.
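
As a quick sanity check that the notation itself cannot matter, here is a minimal sketch. The helper name same_value is made up for illustration, and the 0b prefix is a GNU extension before C23, so this assumes gcc or a compatible compiler.

```c
#include <assert.h>

/* All three spellings denote the same int constant, so the compiler
   works with the value 4 in each case. The 0b prefix is a GNU
   extension before C23 (assumption: gcc or a compatible compiler). */
_Static_assert(0b00000100 == 4, "binary literal equals decimal 4");
_Static_assert(0x04 == 4, "hex literal equals decimal 4");

/* Hypothetical helper, for illustration only: returns 1 when the
   three spellings also compare equal at run time. */
int same_value(void) {
    int a = 0b00000100;
    int b = 0x04;
    int c = 4;
    return a == b && b == c;
}
```

If either static assertion were false, the file would not compile at all, so a successful build already shows the three forms are interchangeable as far as the generated code is concerned.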

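I compiled the following program twice with gcc and no other flags set: once from a file named FIRST_PROG.c into an executable named COMPILED_1, and once from a file named SECOND_PROG.c into an executable named COMPILED_2: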

int main() {
    return 0;
}

If you hd (hex dump) the contents of the resulting executable, at some offset you will see:

00001720  66 72 61 6d 65 5f 64 75  6d 6d 79 5f 69 6e 69 74  |frame_dummy_init|
00001730  5f 61 72 72 61 79 5f 65  6e 74 72 79 00 46 49 52  |_array_entry.FIR|
00001740  53 54 5f 50 52 4f 47 2e  63 00 5f 5f 46 52 41 4d  |ST_PROG.c.__FRAM|

Notice that the name of the source file, FIRST_PROG.c, is embedded in the generated executable. Looking at the same location in the second file shows:

00001720  66 72 61 6d 65 5f 64 75  6d 6d 79 5f 69 6e 69 74  |frame_dummy_init|
00001730  5f 61 72 72 61 79 5f 65  6e 74 72 79 00 53 45 43  |_array_entry.SEC|
00001740  4f 4e 44 5f 50 52 4f 47  2e 63 00 5f 5f 46 52 41  |OND_PROG.c.__FRA|

As you can see, SECOND_PROG.c is embedded in that binary as well.
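
If you would rather search for the embedded name than eyeball a hex dump, a rough sketch along these lines might do. The function name find_in_file is hypothetical, and reading the whole file into memory is an assumption that only suits small executables.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the byte offset of the first occurrence
   of `needle` in the file at `path`, or -1 if absent or on error. */
long find_in_file(const char *path, const char *needle) {
    assert(path != NULL && needle != NULL);

    FILE *f = fopen(path, "rb");
    if (!f) return -1;

    /* Determine the file size, then read everything into one buffer. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return -1; }
    rewind(f);

    unsigned char *buf = malloc((size_t)size + 1);
    if (!buf) { fclose(f); return -1; }
    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);

    /* Naive scan: compare the needle at every possible start offset. */
    size_t n = strlen(needle);
    long found = -1;
    if (n > 0 && got >= n) {
        for (size_t i = 0; i + n <= got; i++) {
            if (memcmp(buf + i, needle, n) == 0) {
                found = (long)i;
                break;
            }
        }
    }
    free(buf);
    return found;
}
```

Calling find_in_file("COMPILED_1", "FIRST_PROG.c") should return a non-negative offset, while searching COMPILED_1 for "SECOND_PROG.c" should return -1. In the dump above the name starts at offset 0x173d, which is what the call would report assuming the name does not also occur earlier in the file.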

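Dumping the two executables with objdump -s does not show this anywhere, which matches the clean diff you got from your programs. However, listing the contents of the generated executable with readelf -a does show it: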

Symbol table '.symtab' contains 66 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000400238     0 SECTION LOCAL  DEFAULT    1 
     2: 0000000000400254     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000400274     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000400298     0 SECTION LOCAL  DEFAULT    4 
     5: 00000000004002b8     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000400300     0 SECTION LOCAL  DEFAULT    6 
     7: 0000000000400338     0 SECTION LOCAL  DEFAULT    7 
     8: 0000000000400340     0 SECTION LOCAL  DEFAULT    8 
     9: 0000000000400360     0 SECTION LOCAL  DEFAULT    9 
    10: 0000000000400378     0 SECTION LOCAL  DEFAULT   10 
    11: 0000000000400390     0 SECTION LOCAL  DEFAULT   11 
    12: 00000000004003b0     0 SECTION LOCAL  DEFAULT   12 
    13: 00000000004003d0     0 SECTION LOCAL  DEFAULT   13 
    14: 00000000004003e0     0 SECTION LOCAL  DEFAULT   14 
    15: 0000000000400564     0 SECTION LOCAL  DEFAULT   15 
    16: 0000000000400570     0 SECTION LOCAL  DEFAULT   16 
    17: 0000000000400574     0 SECTION LOCAL  DEFAULT   17 
    18: 00000000004005a8     0 SECTION LOCAL  DEFAULT   18 
    19: 0000000000600e10     0 SECTION LOCAL  DEFAULT   19 
    20: 0000000000600e18     0 SECTION LOCAL  DEFAULT   20 
    21: 0000000000600e20     0 SECTION LOCAL  DEFAULT   21 
    22: 0000000000600e28     0 SECTION LOCAL  DEFAULT   22 
    23: 0000000000600ff8     0 SECTION LOCAL  DEFAULT   23 
    24: 0000000000601000     0 SECTION LOCAL  DEFAULT   24 
    25: 0000000000601020     0 SECTION LOCAL  DEFAULT   25 
    26: 0000000000601030     0 SECTION LOCAL  DEFAULT   26 
    27: 0000000000000000     0 SECTION LOCAL  DEFAULT   27 
    28: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    29: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   21 __JCR_LIST__
    30: 0000000000400410     0 FUNC    LOCAL  DEFAULT   14 deregister_tm_clones
    31: 0000000000400450     0 FUNC    LOCAL  DEFAULT   14 register_tm_clones
    32: 0000000000400490     0 FUNC    LOCAL  DEFAULT   14 __do_global_dtors_aux
    33: 0000000000601030     1 OBJECT  LOCAL  DEFAULT   26 completed.7585
    34: 0000000000600e18     0 OBJECT  LOCAL  DEFAULT   20 __do_global_dtors_aux_fin
    35: 00000000004004b0     0 FUNC    LOCAL  DEFAULT   14 frame_dummy
    36: 0000000000600e10     0 OBJECT  LOCAL  DEFAULT   19 __frame_dummy_init_array_
    37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS FIRST_PROG.c
    38: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    39: 0000000000400698     0 OBJECT  LOCAL  DEFAULT   18 __FRAME_END__
    40: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   21 __JCR_END__
    41: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS 
    42: 0000000000600e18     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_end
    43: 0000000000600e28     0 OBJECT  LOCAL  DEFAULT   22 _DYNAMIC
    44: 0000000000600e10     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_start
    45: 0000000000400574     0 NOTYPE  LOCAL  DEFAULT   17 __GNU_EH_FRAME_HDR
    46: 0000000000601000     0 OBJECT  LOCAL  DEFAULT   24 _GLOBAL_OFFSET_TABLE_
    47: 0000000000400560     2 FUNC    GLOBAL DEFAULT   14 __libc_csu_fini
    48: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTab
    49: 0000000000601020     0 NOTYPE  WEAK   DEFAULT   25 data_start
    50: 0000000000601030     0 NOTYPE  GLOBAL DEFAULT   25 _edata
    51: 0000000000400564     0 FUNC    GLOBAL DEFAULT   15 _fini
    52: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@@GLIBC_
    53: 0000000000601020     0 NOTYPE  GLOBAL DEFAULT   25 __data_start
    54: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    55: 0000000000601028     0 OBJECT  GLOBAL HIDDEN    25 __dso_handle
    56: 0000000000400570     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
    57: 00000000004004f0   101 FUNC    GLOBAL DEFAULT   14 __libc_csu_init
    58: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   26 _end
    59: 00000000004003e0    42 FUNC    GLOBAL DEFAULT   14 _start
    60: 0000000000601030     0 NOTYPE  GLOBAL DEFAULT   26 __bss_start
    61: 00000000004004d6    11 FUNC    GLOBAL DEFAULT   14 main
    62: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
    63: 0000000000601030     0 OBJECT  GLOBAL HIDDEN    25 __TMC_END__
    64: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
    65: 0000000000400390     0 FUNC    GLOBAL DEFAULT   11 _init

Notice that entry 37 contains the name of the source file. If you diff the readelf -a output of the two executables, you do get some very useful information:

81c81
<   [28] .shstrtab         STRTAB           0000000000000000  0000189f
---
>   [28] .shstrtab         STRTAB           0000000000000000  000018a0
86c86
<        0000000000000207  0000000000000000           0     0     1
---
>        0000000000000208  0000000000000000           0     0     1
211c211
<     37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS FIRST_PROG.c
---
>     37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS SECOND_PROG.c
258c258
<     Build ID: 2c64961288049002e34a1f14e55d6c80dd96816c
---
>     Build ID: 5425dec81aae53bd30e85fe94659d320bb774dcc

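It seems that most of these differences come down to the source files having different names. The Build ID presumably changes as a consequence, since the linker derives it from the contents of the output file.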
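
The one-byte shifts in the diff line up with the lengths of the two names. Here is a small worked check. The helper names are made up for illustration, and treating the 0x207/0x208 pair as the size of a string table is an assumption, since the diff does not name the section.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: how many bytes longer `second` is than `first`. */
long name_length_delta(const char *first, const char *second) {
    return (long)strlen(second) - (long)strlen(first);
}

/* Values copied from the readelf diff above. */
enum {
    SHSTRTAB_OFF_1 = 0x189f, /* .shstrtab file offset, first build  */
    SHSTRTAB_OFF_2 = 0x18a0, /* .shstrtab file offset, second build */
    SECTION_SIZE_1 = 0x207,  /* section size, first build (assumed string table)  */
    SECTION_SIZE_2 = 0x208   /* section size, second build (assumed string table) */
};

/* Returns 1 if both shifts equal the difference in name length. */
int shifts_match_names(void) {
    long d = name_length_delta("FIRST_PROG.c", "SECOND_PROG.c");
    return d == SHSTRTAB_OFF_2 - SHSTRTAB_OFF_1
        && d == SECTION_SIZE_2 - SECTION_SIZE_1;
}
```

SECOND_PROG.c is one character longer than FIRST_PROG.c, so one section grows by a byte and the data stored after it moves by a byte, which is consistent with what the diff shows.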

So my official answer is "this has nothing whatsoever to do with integer literals and is purely a function of compiling files with different names."