Why can't the opcodes I get from NASM be executed correctly by the Bochs i386 CPU?
Abstract:
I found that many of the output formats NASM supports generate very sparse machine code interleaved with zero bytes. Most importantly, that code is not decoded correctly by Bochs' i386 CPU.
I believe the fault is mine, but I don't know where or why.
My source:
cli
cli
mov ax,cs
mov ds,ax
mov es,ax
call ClearTty <- here
call ResetCursor <- here
mov al,43h ;'C'
call DispAL
jmp $
...
If I output the "bin" format: nasm -f bin boot.s -o boot.o
bin:
fafa 8cc8 8ed8 8ec0 e80a 00e8 2500 b043 <- No 0000 filled, GOOD
e838 00eb feb0 0066 5566 5450 5152 b406 <- No 0000 filled, GOOD
b900 008a 3685 00b2 50cd 105a 5958 665c
665d c3ba 0000 6655 6654 5053 b402 b700
cd10 5b58 665c 665d c3b0 4166 5566 5450
5351 b409 b700 b30f b901 00cd 1059 5b58
665c 665d c350 80fa 5072 07b2 00fe c6e9
0200 fec2 3a36 8500 7609 b001 e898 ff8a
It looks compact, good! It executes correctly.
This is what NASM thinks it should generate for the bin format:
compile to bin
ADDRESS OPCODES DISASM
00000000 FA cli
00000001 FA cli
00000002 8CC8 mov ax,cs
00000004 8ED8 mov ds,ax
00000006 8EC0 mov es,ax
00000008 E80A00 call ClearS <- GOOD
0000000B E82500 call ResetCursor <- GOOD
Good! That's what I want!
But when I generate other formats (because bin does not support linking),
e.g. ELF: nasm -f elf boot.s -o boot.o
[boot.elf: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped]
elf 352-bytes-header omitted
fafa 668c c88e d88e c0e8 0e00 0000 e82c <- 0000 WHY????
0000 00b0 43e8 3e00 0000 ebfe b000 5554 <- 0000 WHY????
6650 6651 6652 b406 66b9 0000 8a35 9f00
0000 b250 cd10 665a 6659 6658 5c5d c366
ba00 0055 5466 5066 53b4 02b7 00cd 1066
5b66 585c 5dc3 b041 5554 6650 6653 6651
b409 b700 b30f 66b9 0100 cd10 6659 665b
What NASM thinks it should generate:
compile to elf
00000000 FA cli
00000001 FA cli
00000002 668CC8 mov ax,cs
00000005 8ED8 mov ds,ax
00000007 8EC0 mov es,ax
00000009 E80E000000 call ClearS <- Very long code ??
0000000E E82C000000 call ResetCursor <- Very long code ??
How it is actually executed by the CPU:
00007eb0: cli ; fa
00007eb1: cli ; fa
00007eb2: mov ax, cs ; 668cc8
00007eb5: mov ds, ax ; 8ed8
00007eb7: mov es, ax ; 8ec0
00007eb9: call .+14 ; e80e00
00007ebc: add byte ptr ds:[bx+si], al ; 0000 WRONG!!! What is that?
00007ebe: call .+44 ; e82c00
00007ec1: add byte ptr ds:[bx+si], al ; 0000 WRONG!!!
00007ec3: mov al, 0x43 ; b043
00007ec5: call .+62 ; e83e00
00007ec8: add byte ptr ds:[bx+si], al ; 0000 WRONG!!!
00007eca: jmp .-2 ; ebfe
Moreover, if I generate other output formats, such as Mach-O or Obj:
compile to other e.g. MachO [boot.o: Mach-O object i386]
00000000 FA cli
00000001 FA cli
00000002 668CC8 mov ax,cs
00000005 8ED8 mov ds,ax
00000007 8EC0 mov es,ax
00000009 E80E000000 call ClearS <- Still so long
0000000E E82C000000 call ResetCursor <- Still so long
Still wrong.
How can I get this right and generate code that the Bochs i386 CPU executes correctly? Or how can I adjust Bochs so that it can execute this code?
my bochsrc: cpuid: level=6, mmx=1, apic=xapic, sep=1, aes=1, movbe=1,
simd=ssse3, misaligned_sse=1
In short: because ELF has no 16-bit code class.
Long answer:
Oh, that's because NASM generated a 32-bit ELF image.
F:\dev>objdump -D test
test: file format elf32-i386
Disassembly of section .text:
00000000 <ClearTty-0x13>:
0: fa cli
1: fa cli
2: 66 8c c8 mov %cs,%ax
5: 8e d8 mov %eax,%ds
7: 8e c0 mov %eax,%es
9: e8 05 00 00 00 call 13 <ClearTty>
e: e8 01 00 00 00 call 14 <ResetCursor>
00000013 <ClearTty>:
13: c3 ret
00000014 <ResetCursor>:
14: f4 hlt
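To make the mismatch concrete, here is a minimal Python sketch (not part of NASM or Bochs; the helper name decode_call is hypothetical) that decodes the ELF build's first call bytes, e8 0e 00 00 00 from the question, once with a 32-bit operand size and once with a 16-bit one. It assumes only the near CALL encoding: opcode E8 followed by a little-endian signed displacement of 2 or 4 bytes.

```python
def decode_call(code: bytes, offset: int, bits: int):
    """Decode a near CALL (opcode E8) at `offset`.

    Returns (instruction_length, relative_displacement).
    `bits` is the operand size the decoder assumes: 16 or 32.
    """
    if code[offset] != 0xE8:
        raise ValueError("not a near CALL (E8)")
    width = bits // 8  # rel16 takes 2 bytes, rel32 takes 4 bytes
    rel = int.from_bytes(code[offset + 1:offset + 1 + width], "little", signed=True)
    return 1 + width, rel


# First call emitted by the ELF build in the question.
elf_call = bytes.fromhex("e80e000000")

len32, rel32 = decode_call(elf_call, 0, 32)  # what NASM intended
len16, rel16 = decode_call(elf_call, 0, 16)  # what a real-mode CPU decodes
leftover = elf_call[len16:]                  # bytes the 16-bit decoder has not consumed

print(len32, rel32)    # 5 14
print(len16, rel16)    # 3 14
print(leftover.hex())  # 0000
```

The two leftover zero bytes are what Bochs then decodes as a separate instruction, add byte ptr ds:[bx+si], al, which matches the lines marked WRONG in the question's trace.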
6.1 bin: Flat-Form Binary Output
Using the bin format puts NASM by default into 16-bit mode.
7.9.7 16-bit code and ELF
The ELF32 specification doesn't provide relocations for 8- and 16-bit values, but the GNU ld linker adds these as an extension. NASM can generate GNU-compatible relocations, to allow 16-bit code to be linked as ELF using GNU ld. If NASM is used with the -w+gnu-elf-extensions option, a warning is issued when one of these relocations is generated.
If you have BITS 16, then NASM generates a 32-bit ELF image that contains 16-bit code.
Look at this:
test: file format elf32-i386
Disassembly of section .text:
00000000 <ClearTty-0xe>:
0: fa cli
1: fa cli
2: 8c c8 mov %cs,%eax
4: 8e d8 mov %eax,%ds
6: 8e c0 mov %eax,%es
8: e8 03 00 e8 01 call 1e80010 <ResetCursor+0x1e80001>
; e8 03 00 e8 01 should be e8 03 00 and e8 01 00 <-- two call instructions
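The bogus target in that disassembly can be reproduced by hand. The sketch below is an illustration only (call_target is a hypothetical helper, and the byte string is the e8 03 00 / e8 01 00 pair described in the comment above, starting at offset 8). It computes the call target the way a 32-bit disassembler would and the way a 16-bit CPU would.

```python
def call_target(code: bytes, address: int, bits: int) -> int:
    """Return the target of a near CALL (E8) whose first byte sits at `address`."""
    if code[0] != 0xE8:
        raise ValueError("not a near CALL (E8)")
    width = bits // 8  # 2-byte or 4-byte displacement
    rel = int.from_bytes(code[1:1 + width], "little", signed=True)
    next_ip = address + 1 + width          # address of the following instruction
    return (next_ip + rel) & ((1 << bits) - 1)


# Two 16-bit calls back to back, located at offset 0x8 in the section.
stream = bytes.fromhex("e80300e80100")

as_32bit = call_target(stream, 0x8, 32)       # how objdump read it
first_16 = call_target(stream, 0x8, 16)       # first real call
second_16 = call_target(stream[3:], 0xB, 16)  # second real call

print(hex(as_32bit), hex(first_16), hex(second_16))  # 0x1e80010 0xe 0xf
```

Decoded as 32-bit, the first call swallows the opcode and first displacement byte of the second call, which gives the 1e80010 target in the listing. Decoded as 16-bit, the same bytes are two short calls to 0xe and 0xf.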
But the format is still elf32-i386. So the question now is: why? Let's look at the ELF documentation:
http://www.skyfree.org/linux/references/ELF_Format.pdf
EI_CLASS
The next byte, e_ident[EI_CLASS], identifies the file’s class, or capacity.
The file format is designed to be portable among machines of various sizes, without imposing the sizes of the largest machine on the smallest. Class ELFCLASS32 supports machines with files and virtual address spaces up to 4 gigabytes; it uses the basic types defined above.
Class ELFCLASS64 is reserved for 64-bit architectures. Its appearance here shows how the object file may change, but the 64-bit format is otherwise unspecified. Other classes will be defined as necessary, with different basic types and sizes for object file data.
So ELF does not support a 16-bit code class!
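As a rough illustration of the EI_CLASS byte quoted above, this Python sketch reads it from the start of an ELF header. The helper name elf_class and the hand-built sample_ident bytes are assumptions for the example; the index (4) and the values 1 for ELFCLASS32 and 2 for ELFCLASS64 come from the ELF specification.

```python
EI_CLASS = 4  # index of the class byte inside e_ident
ELF_CLASSES = {0: "ELFCLASSNONE", 1: "ELFCLASS32", 2: "ELFCLASS64"}


def elf_class(header: bytes) -> str:
    """Return the class name stored in e_ident[EI_CLASS] of an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return ELF_CLASSES.get(header[EI_CLASS], "unknown")


# Hand-built 16-byte e_ident: magic, 32-bit class, little-endian, version 1, padding.
sample_ident = b"\x7fELF\x01\x01\x01" + bytes(9)

print(elf_class(sample_ident))  # ELFCLASS32
```

To check a real object file, the first 16 bytes could be read with open(path, "rb").read(16) and passed to the same function. There is no class value for 16-bit objects, so an object assembled with BITS 16 still reports ELFCLASS32.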