db instruction executed after call

Question

I am learning to program in assembly language, but I came across this code and I can't understand how its instructions are executed:

xor eax,eax
xor ebx,ebx
xor ecx,ecx
xor edx,edx
jmp short string
code:
pop ecx 
mov bl,1
mov dl,13
mov al,4
int 0x80
dec bl
mov al,1
int 0x80
string: 
call code 
db 'hello, world!'

Why is the db instruction executed after call code, if the call instruction is executed before it?

Answer 1

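The string is never executed, because the code issues the exit system call before execution could reach that point. The call pushes the address of the bytes that follow it (the string) onto the stack as its return address, and pop ecx then picks that address up as a pointer to the data. The code never executes a ret, so control never comes back to the db line.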
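To make the call/pop trick concrete, here is a small Python sketch that works on the machine code of the question's listing. The hex bytes were assembled by hand for 32-bit x86, and LOAD_ADDR is a hypothetical load address chosen only for illustration; neither comes from the original post.

```python
import struct

# Machine code of the question's listing (hand-assembled, 32-bit x86).
payload = bytes.fromhex(
    "31c0 31db 31c9 31d2"   # xor eax,eax / ebx,ebx / ecx,ecx / edx,edx
    "eb0f"                  # jmp short string
    "59"                    # code: pop ecx
    "b301 b20d b004 cd80"   # mov bl,1 / mov dl,13 / mov al,4 / int 0x80
    "fecb b001 cd80"        # dec bl / mov al,1 / int 0x80
    "e8ecffffff"            # string: call code
) + b"hello, world!"        # db 'hello, world!'

LOAD_ADDR = 0x08048060      # hypothetical address; any value gives the same result

call_at = payload.index(b"\xe8")      # offset of the call opcode
return_offset = call_at + 5           # call rel32 is 5 bytes long
pushed = LOAD_ADDR + return_offset    # return address that call pushes
ecx = pushed                          # pop ecx takes it back off the stack

# The call target is the return address plus the signed 32-bit displacement.
rel32 = struct.unpack("<i", payload[call_at + 1:call_at + 5])[0]
target_offset = return_offset + rel32

text = payload[ecx - LOAD_ADDR:ecx - LOAD_ADDR + 13]
print(text.decode())                  # hello, world!
print(hex(payload[target_offset]))    # 0x59, the opcode of pop ecx
```

The value in ecx is exactly the buffer pointer that the write system call expects, which is why the string bytes are only ever read as data.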

This is what normal code looks like:

asm
asm
more asm
call subroutine   ->> branches to subroutine
        subroutine:  asm
                     more asm
                     ret   ->> execution returns to the point after the call.
xor eax,eax          <<-- first instruction executed after the ret
ret                  <<-- return to caller.

Here is a commented version of the code subroutine.

code:
pop ecx        ;get the address of 'hello, world!'
mov bl,1       ;write to stdout
mov dl,13      ;length is 13
mov al,4       ;syscall write
int 0x80       ;perform the write
dec bl         ;set exit code to 0 {success}
mov al,1       ;syscall exit
int 0x80       ;exit the program
;execution does not return here!
string: 
call code 
db 'hello, world!'

A better approach is to code the last part as follows:

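section .data
hello: db 'hello, world!'

section .text
global _start
_start:
mov ecx,hello      ;address of the string
mov edx,13         ;length of the string
call print         ;returns as normal
call exit_program  ;will not return

print: mov eax,4   ;syscall write
mov ebx,1          ;stdout
int 0x80
ret

exit_program:
xor ebx,ebx        ;exit code 0
mov eax,1          ;syscall exit
int 0x80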

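This has the following advantages:

It uses fewer instructions;
It has no mismatched calls/returns;
It does not mix byte registers with full registers;
It is simple to understand;
You can reuse the print and exit_program subroutines in other parts of your program.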

Answer 2

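Just to show what I meant by "other way defining byte values": this variant of your code does the same thing, but it shows how you can define a string with instructions, and how you can define instructions with the db directive. Both make the source much harder for a human to read, but for the assembler the difference is negligible: it produces the same binary machine code, and to the CPU the same machine code is the same machine code; it does not care what your source looked like.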

I also tried to comment extensively on each line: what it does and why it is used in the code.

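The code is written in this non-trivial way because it is an example of a shell-exploit payload, where your assembly must not only do what you want, but the resulting machine code must also satisfy additional constraints. It must not contain any zero bytes (which would make it difficult to pass as a "string" when injecting the payload through certain vulnerabilities), and it must be PIC (position-independent code), so it cannot use absolute addresses or depend on being placed at any specific location when executed.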
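As a rough illustration of the zero-byte constraint, the following Python sketch compares two ways of getting 1 into ebx. The encodings were written out by hand for 32-bit x86, and has_zero_byte is a hypothetical helper introduced here, not something from the original answer.

```python
def has_zero_byte(code: bytes) -> bool:
    """Return True if the machine code contains a 0x00 byte."""
    return 0 in code

direct = bytes.fromhex("bb01000000")   # mov ebx,1 (imm32 carries three zero bytes)
indirect = bytes.fromhex("31db b301")  # xor ebx,ebx / mov bl,1

for name, code in (("mov ebx,1", direct),
                   ("xor ebx,ebx; mov bl,1", indirect)):
    hex_bytes = " ".join(f"{b:02x}" for b in code)
    print(f"{name:24} {hex_bytes:16} zero byte: {has_zero_byte(code)}")
```

The two-instruction form is also one byte shorter here, but the point for a payload is only that none of its bytes is 0x00, since a zero byte would end a C-style string early.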

    ; sets basic registers eax,ebx,ecx,edx to zero (ecx not needed BTW)
    xor eax,eax
    db '1', 0xDB        ; xor ebx,ebx defined by "db" for fun
    db '1', 0xC9        ; xor ecx,ecx defined by "db" for fun
    xor edx,edx
    ; short-jump forward to make later "call code" to produce
    ; negative relative offset, so zero in "call" opcode is avoided
    ; "call code" from here would need zeroes in rel32 offset encoding
    jmp short string    ; the "jmp short string" is encoded as "EB 0F"
code:
    pop ecx             ; loads the address of string from the stack into ecx
    mov bl,1            ; ebx = 1 = STD_OUT stream, avoiding zeroes in
        ; "mov ebx,1" opcode, so instead "xor ebx,ebx mov bl,1" is used
    mov dl,13           ; edx = 13 = length of string
    mov al,4            ; eax = 4 = sys_write
    int 0x80            ; sys_write(STD_OUT, 'hello, world!', 13);
    dec bl              ; ebx = 0 = exit code "OK"
    mov al,1            ; eax = 1 = sys_exit
    int 0x80            ; sys_exit(0);
string:
    call code           ; return address == string address -> pushed on stack
    ; also "code:" is ahead, so relative offset is negative => no zero in opcode
    ; resulting call opcode is "E8 EC FF FF FF"

    ; following bytes are NOT executed as code, they contain string data
    push 0x6f6c6c65     ; 'hello'
    sub al,0x20         ; ', '
    ja  short $+0x6f+2  ; 'wo'
    jb  short $+0x6c+2  ; 'rl'
    db 'd!'

To build it I used nasm -f elf *.asm; ld -m elf_i386 -s -o demo *.o (ignore the warning). To disassemble the result and check how the actual machine code decodes into instructions, you can use objdump -M intel -d demo.

(The code above and objdump also work on the online site http://www.tutorialspoint.com/compile_assembly_online.php if you want to test it.)
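If no assembler is at hand, the encodings quoted in the listing's comments can also be checked with a few lines of Python. This is only a sketch: the offsets (jmp at 8, code: at 10, string: at 25) were counted by hand from the listing, the helper names rel8 and rel32 are made up for this example, and the forward call is a hypothetical case that does not appear in the original code.

```python
import struct

def rel8(instr_offset: int, target: int) -> bytes:
    """Displacement byte of a 2-byte short jump (opcode + rel8)."""
    return struct.pack("<b", target - (instr_offset + 2))

def rel32(instr_offset: int, target: int) -> bytes:
    """Displacement of a 5-byte near call (E8 + rel32), little-endian."""
    return struct.pack("<i", target - (instr_offset + 5))

# Offsets counted by hand from the listing.
JMP_AT, CODE, STRING = 8, 10, 25

jmp_bytes = b"\xeb" + rel8(JMP_AT, STRING)     # jmp short string
call_bytes = b"\xe8" + rel32(STRING, CODE)     # call code (backward)
forward = b"\xe8" + struct.pack("<i", 15)      # hypothetical forward call, +15

# The "instructions" after the call are just another spelling of the string.
as_instructions = bytes.fromhex(
    "68656c6c6f"   # push 0x6f6c6c65
    "2c20"         # sub al,0x20
    "776f"         # ja short $+0x6f+2
    "726c"         # jb short $+0x6c+2
) + b"d!"          # db 'd!'

# And db '1', 0xDB is just another spelling of xor ebx,ebx.
xor_ebx = bytes([ord("1"), 0xDB])

print(jmp_bytes.hex())     # eb0f
print(call_bytes.hex())    # e8ecffffff
print(forward.hex())       # e80f000000
print(as_instructions)     # b'hello, world!'
print(xor_ebx.hex())       # 31db
```

The backward call has a negative displacement, so its upper bytes are FF rather than 00, which is the reason the listing jumps forward first and calls backward afterwards.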

x86

assembly

nasm

shellcode