Compiling assembly program to flat-form binary includes extraneous 'f' chars that don't exist in other formats

Question

I'm writing a program in assembly:

xor eax, eax        ; eax = 0
push eax            ; push a NULL dword (string terminator)
push 0x68732f2f     ; push "//sh"
push 0x6e69622f     ; push "/bin"
mov ebx, esp        ; first argument: pointer to "/bin//sh"
push eax            ; NULL dword for the third argument
mov edx, esp        ; third argument (envp): pointer to NULL
push eax            ; NULL dword for the second argument
mov ecx, esp        ; second argument (argv): pointer to NULL
mov al, 11          ; execve is system call number 11
int 0x80            ; invoke the system call

When I assemble this with nasm as a flat-form binary, I see extraneous f characters in the hex representation of the program. I expected to see:

0000000: 31c0 5068 2f2f 7368 682f 6269 6e89 e350  1.Ph//shh/bin..P                                  
0000010: 89e2 5089 e1b0 0bcd 80                   ..P......

But I actually see:

0000000: 6631 c066 5066 682f 2f73 6866 682f 6269  f1.fPfh//shfh/bi
0000010: 6e66 89e3 6650 6689 e266 5066 89e1 b00b  nf..fPf..fPf....
0000020: cd80                                     ..

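Oddly, when I assemble my program with nasm in another format (for example ELF-32), I do see the hex representation I expect (albeit with a lot of other hex that I probably shouldn't include in my solution):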

0000000: 7f45 4c46 0101 0100 0000 0000 0000 0000  .ELF............
0000010: 0100 0300 0100 0000 0000 0000 0000 0000  ................
0000020: 4000 0000 0000 0000 3400 0000 0000 2800  @.......4.....(.
0000030: 0500 0200 0000 0000 0000 0000 0000 0000  ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0000 0000 0000 0000 0100 0000 0100 0000  ................
0000070: 0600 0000 0000 0000 1001 0000 1900 0000  ................
0000080: 0000 0000 0000 0000 1000 0000 0000 0000  ................
0000090: 0700 0000 0300 0000 0000 0000 0000 0000  ................
00000a0: 3001 0000 2100 0000 0000 0000 0000 0000  0...!...........
00000b0: 0100 0000 0000 0000 1100 0000 0200 0000  ................
00000c0: 0000 0000 0000 0000 6001 0000 3000 0000  ........`...0...
00000d0: 0400 0000 0300 0000 0400 0000 1000 0000  ................
00000e0: 1900 0000 0300 0000 0000 0000 0000 0000  ................
00000f0: 9001 0000 1000 0000 0000 0000 0000 0000  ................
0000100: 0100 0000 0000 0000 0000 0000 0000 0000  ................
0000110: 31c0 5068 2f2f 7368 682f 6269 6e89 e350  1.Ph//shh/bin..P
0000120: 89e2 5089 e1b0 0bcd 8000 0000 0000 0000  ..P.............
0000130: 002e 7465 7874 002e 7368 7374 7274 6162  ..text..shstrtab
0000140: 002e 7379 6d74 6162 002e 7374 7274 6162  ..symtab..strtab
0000150: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000160: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000170: 0100 0000 0000 0000 0000 0000 0400 f1ff  ................
0000180: 0000 0000 0000 0000 0000 0000 0300 0100  ................
0000190: 0073 6865 6c6c 7370 6177 6e2e 6173 6d00  .shellspawn.asm.

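For what I'm trying to accomplish, I think I have to use a flat-form binary.

My question: where does this f character come from, what does it mean, and how can I remove it?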

NASM version 2.10.09 compiled on Dec 29 2013

xxd V1.10 27oct98 by Juergen Weigert

Answer 1

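I'll preface this by saying I'm not 100% sure, but...

According to this website, it's an operand-size override prefix. My guess is that the assembler is being extra cautious, or that it's set up for 32-bit output and is trying to make sure your xor and mov instructions are the right size. If that site is correct, it doesn't seem to actually affect runtime, so maybe it's just extra window dressing.

I'd double-check the flags you're passing to nasm to make sure you aren't accidentally telling it to do more than you intend.

Or hopefully you'll get a more in-depth answer from someone who has written x86 asm more recently :)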

Answer 2

nasm -f bin sets the default mode to 16-bit. In that mode, instructions with a 32-bit operand size (such as xor eax,eax or push eax) must be encoded with a 0x66 operand-size prefix. See this table of modes vs. prefixes for operand-size and address-size in 16, 32, or 64-bit mode. (f is 0x66 in ASCII.) See also the links to x86 manuals in the x86 tag wiki, and how to disassemble flat binaries.
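To make the mode dependence concrete, here is a small sketch (not part of the original answer) that assembles the same two instructions under both modes. The file name is hypothetical. The `xor eax, eax` encodings match the flat-binary and ELF dumps in the question. Note that the prefix is not cosmetic: it toggles the operand size relative to the current mode's default, so bytes assembled for 16-bit mode decode as different instructions when executed in 32-bit mode.

```nasm
; Sketch: same mnemonics, different encodings depending on the BITS mode.
; Assemble with: nasm -f bin prefix_demo.asm -o prefix_demo.bin
; (prefix_demo.asm is a hypothetical file name.)

BITS 16                 ; this is what -f bin defaults to
    xor eax, eax        ; 66 31 C0  (prefix selects 32-bit operand size)
    xor ax, ax          ; 31 C0     (16-bit is the default in this mode)

BITS 32
    xor eax, eax        ; 31 C0     (32-bit is the default in this mode)
    xor ax, ax          ; 66 31 C0  (prefix selects 16-bit operand size)
```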

-f elf is a synonym for -felf32, so it targets 32-bit mode. -felf64 targets 64-bit mode (where push eax is not encodable).

See section 6.1 in the NASM manual: BITS 32 tells the assembler to assemble for 32-bit mode, overriding the default based on the output file format. I don't know if there's a command-line option that makes flat binaries with 32-bit code; I don't see one in the man page or --help. If for some reason you really didn't want to change your sources, you could use -felf and use ld --oformat binary to link flat binaries. See How to generate plain binaries like nasm -f bin with the GNU GAS assembler?

Ah, the docs imply that BITS 32 is the only way:

The most likely reason for using the BITS directive is to write 32-bit or 64-bit code in a flat binary file;
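Applied to the question's source, that would look roughly like the sketch below. The only change is the added BITS 32 line; the output file name in the comments is an assumption. With it, the flat binary should be the 25 bytes the question expected, with no 0x66 prefixes.

```nasm
; Sketch: the question's source with a BITS directive added.
; Build and inspect (the output name is an assumption):
;   nasm -f bin shellspawn.asm -o shellspawn.bin
;   xxd shellspawn.bin

BITS 32                 ; assemble for 32-bit mode even though the format is bin

    xor eax, eax        ; eax = 0
    push eax            ; NULL dword terminating the string
    push 0x68732f2f     ; "//sh"
    push 0x6e69622f     ; "/bin"
    mov ebx, esp        ; first argument: pointer to "/bin//sh"
    push eax            ; NULL dword
    mov edx, esp        ; third argument (envp): pointer to NULL
    push eax            ; NULL dword
    mov ecx, esp        ; second argument (argv): pointer to NULL
    mov al, 11          ; execve is system call number 11
    int 0x80            ; invoke the system call
```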

Code review of the actual code:

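If you just want to pass pointers to NULL as the second and third arguments, why not have them both point to the result of the first push eax (which is also the string terminator)? i.e. xor-zero / push / mov edx, esp / mov ecx, esp.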
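One way to read that suggestion is sketched below. This is an interpretation of the shorthand, not code from the original answer; BITS 32 is included so it assembles as 32-bit code with -f bin. execve receives the same arguments as in the original, but the two extra push eax instructions are gone.

```nasm
; Sketch: reuse the single NULL dword for argv and envp.

BITS 32

    xor eax, eax        ; eax = 0
    push eax            ; one NULL dword: string terminator and NULL pointer slot
    mov edx, esp        ; third argument (envp): pointer to that NULL
    mov ecx, esp        ; second argument (argv): pointer to the same NULL
    push 0x68732f2f     ; "//sh"
    push 0x6e69622f     ; "/bin"
    mov ebx, esp        ; first argument: pointer to "/bin//sh"
    mov al, 11          ; execve is system call number 11 (upper bytes of eax are 0)
    int 0x80            ; invoke the system call
```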

Also, the man page says you can actually pass argv=NULL and envp=NULL (but warns that this is non-portable and not to rely on it). So you could do xor edx,edx / push edx / ... / mov ecx,edx / lea eax, [edx+11].
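Filling in the elided middle with the question's two string pushes gives something like the sketch below. This is an assumption about what the "..." stands for, and it relies on the Linux-specific behavior the man page warns about.

```nasm
; Sketch: pass argv = NULL and envp = NULL directly (Linux-specific, non-portable).

BITS 32

    xor edx, edx        ; edx = 0, so the third argument (envp) is NULL
    push edx            ; NULL dword terminating the string
    push 0x68732f2f     ; "//sh"
    push 0x6e69622f     ; "/bin"
    mov ebx, esp        ; first argument: pointer to "/bin//sh"
    mov ecx, edx        ; second argument (argv) is NULL
    lea eax, [edx+11]   ; eax = 11 (execve) without zeroing eax first
    int 0x80            ; invoke the system call
```

The lea is used here because eax was never zeroed, so mov al, 11 alone would leave the upper bytes of eax undefined.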
