How to access a segment register without linking libc.so?

Question

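I am trying to write a simple stack canary in 64-bit assembly using NASM version 2.15.04 on Ubuntu 20.10. Executing the code below results in a segmentation fault when it is assembled and linked with the command nasm -felf64 canary.asm && ld canary.o: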

            global  _start

            section .text
_start:     endbr64
            push    rbp                     ; Save base pointer
            mov     rbp, rsp                ; Set the base pointer
            call    _func                   ; Call _func
            mov     rdi, rax                ; Save return value of _func in RDI
            mov     rax, 0x3c               ; Specify exit syscall 
            syscall                         ; Exit

_func:      endbr64
            push    rbp                     ; Save the base pointer
            mov     rbp, rsp                ; Set the base pointer
            sub     rsp, 0x8                ; Adjust the stack pointer
            mov     rax,  qword fs:[0x28]   ; Get stack canary
            mov     qword [rbp - 0x8], rax  ; Save stack canary on the stack
            xor     eax, eax                ; Clear RAX
            mov     rax, 0x1                ; Specify write syscall
            mov     rdi, 0x1                ; Specify stdout
            mov     rsi, msg                ; Char* buffer to print
            mov     rdx, 0xd                ; Length of the buffer
            syscall                         ; Write msg
            mov     rax, qword [rbp - 0x8]  ; Retrieve the stack canary
            xor     rax, qword fs:[0x28]    ; Compare to original value    
            je      _return                 ; Jump to _return if canary matched original
            xor     eax, eax                ; Clear RAX
            mov     rax, 0x1                ; Specify write syscall 
            mov     rdi, 0x1                ; Specify stdout
            mov     rsi, stack_fail         ; Char* buffer to print
            mov     rdx, 0x18               ; Length of the buffer 
            syscall                         ; Write stack_fail
            mov     rax, 0x3c               ; Specify exit syscall
            mov     rdi, 0x1                ; Specify error code 1
            syscall                         ; Exit

_return:    xor     eax, eax                ; Set return value to 0
            add     rsp, 0x8                ; Reset stack pointer
            pop     rbp                     ; Get original base pointer
            ret                             ; Return 

            section .data
msg:        db      "Hello, World", 0xa, 0x0
stack_fail  db      "Stack smashing detected", 0xa, 0x0

Debugging with GDB shows that the segfault occurs on line 16: mov rax, qword fs:[0x28].

─────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
     0x40101b <_func+4>        push   rbp
     0x40101c <_func+5>        mov    rbp, rsp
     0x40101f <_func+8>        sub    rsp, 0x8
 →   0x401023 <_func+12>       mov    rax, QWORD PTR fs:0x28
     0x40102c <_func+21>       mov    QWORD PTR [rbp-0x8], rax
     0x401030 <_func+25>       xor    eax, eax
     0x401032 <_func+27>       mov    eax, 0x1
     0x401037 <_func+32>       mov    edi, 0x1
     0x40103c <_func+37>       movabs rsi, 0x402000
─────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "a.out", stopped 0x401023 in _func (), reason: SIGSEGV

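However, assembling and dynamically linking against libc with nasm -felf64 canary.asm && ld canary.o -lc -dynamic-linker /usr/lib64/ld-linux-x86-64.so.2 results in successful execution, with no segmentation fault.

Comparing the final binaries with Radare2 shows that both versions assemble the problem instruction to the same bytes: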

0x00401023 64488b042528. mov rax, qword fs:[0x28]

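GDB also shows that in both cases the FS register is 0x0000 when that instruction executes.

So the instruction bytes and the FS register are the same whether or not the binary is linked against libc, and the code uses no external symbols from libc. Why does linking libc result in successful execution, while not linking it results in a segmentation fault? Is it possible to make this work without linking libc, and if so, how?

Note: the relevance of, or need for, a stack canary in this example is not the point of the question.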

Answer 1

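There is no problem accessing segment registers; that is just mov eax, fs. But what you are trying to do is access thread-local storage at a small offset from the FS segment base, which libc's init code will have asked the kernel to set up.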
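The distinction between the FS selector (the 0x0000 that GDB shows) and the FS base can be made visible by asking the kernel for the base. The following is a minimal sketch, assuming x86-64 Linux, the arch_prctl system call (number 158) and its ARCH_GET_FS code (0x1003); the file name and the fsbase label are made up for illustration.

```nasm
; fsbase.asm: report whether this process has a non-zero FS base.
; Build: nasm -felf64 fsbase.asm && ld fsbase.o -o fsbase
            global  _start

            section .text
_start:     mov     eax, 158                ; __NR_arch_prctl
            mov     edi, 0x1003             ; ARCH_GET_FS
            lea     rsi, [rel fsbase]       ; kernel stores the FS base here
            syscall

            xor     edi, edi                ; exit status 0: FS base is zero
            cmp     qword [rel fsbase], 0
            je      .exit
            mov     edi, 1                  ; exit status 1: FS base is non-zero
.exit:      mov     eax, 60                 ; __NR_exit
            syscall

            section .bss
fsbase:     resq    1                       ; receives the FS base address
```

Linked with plain ld, this should exit with status 0, meaning fs:[0x28] refers to linear address 0x28, which is unmapped. Linked with -lc and the dynamic linker as in the question, it should exit with status 1, because the dynamic linker has already set up TLS before _start runs.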

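The simplest way would be to access your stack canary with a normal RIP-relative addressing mode, not relative to the FS base, like GCC does when targeting some other ISAs. You only need TLS if you want to make it harder for some other exploit to reach the canary (its address can then be randomized separately). (Or so that library code can access it without the indirection of loading a pointer from the GOT, instead of the access being efficient only for code in the main executable.)

If you want to replicate GCC's stack-canary code, you can of course make the same system call libc does to set up thread-local storage, and use that.

Fun fact: sub rax, qword fs:[0x28] is a more efficient way to check the canary than XOR: it can macro-fuse with a JCC into a single uop. That is why current GCC changed to using sub. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90568 - fixed in GCC10+.

My GCC bug report actually includes standalone microbenchmark code (to prove that sub can macro-fuse even with an FS: addressing mode).

In a static executable without libc, the following code sets the FS segment so that its base address is the address of a buffer, so [fs: 0x28] will work. This is a rudimentary form of TLS.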
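A minimal sketch of the RIP-relative approach, assuming x86-64 Linux and the getrandom system call (number 318) to fill the canary; the canary and func labels, the frame layout and the file name are illustrative, not taken from GCC.

```nasm
; canary_rel.asm: stack canary kept in .bss and addressed RIP-relative (no FS base needed).
; Build: nasm -felf64 canary_rel.asm && ld canary_rel.o -o canary_rel
            default rel                     ; make [symbol] RIP-relative by default
            global  _start

            section .text
_start:     mov     eax, 318                ; __NR_getrandom
            lea     rdi, [canary]           ; buffer to fill
            mov     esi, 8                  ; number of random bytes
            xor     edx, edx                ; flags = 0
            syscall                         ; return value not checked in this sketch

            call    func
            mov     edi, eax                ; exit status = return value of func
            mov     eax, 60                 ; __NR_exit
            syscall

func:       sub     rsp, 24                 ; 16 bytes of locals plus the canary slot
            mov     rax, [canary]
            mov     [rsp + 16], rax         ; canary sits just below the return address
            mov     qword [rsp], 0          ; stand-in for the real function body
            mov     rax, [rsp + 16]
            cmp     rax, [canary]           ; still intact?
            jne     .smashed
            xor     eax, eax                ; return 0
            add     rsp, 24
            ret
.smashed:   mov     eax, 60                 ; __NR_exit
            mov     edi, 1                  ; error status 1
            syscall

            section .bss
canary:     resq    1                       ; process-wide canary value
```

Because nothing here touches FS, the program runs the same way whether or not libc is linked.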

global _start
_start:

cookie equ 12345
    mov  eax, 158       ; __NR_arch_prctl
    mov  edi, 0x1002    ; ARCH_SET_FS
    lea  rsi, [rel buf] ; RIP-relative address of the buffer
    syscall

    mov  qword [fs: 0x28], cookie

...


section .bss
buf:    resb 4096         ; fs.base will point at this buffer
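For reference, here is one way the fragment above could be turned into a complete program. It is a sketch under assumptions, not the elided part of the original: the func and fail labels, the frame layout and the file name are invented for illustration, and the fixed cookie value is only suitable for a demo.

```nasm
; canary_fs.asm: set the FS base with arch_prctl, then use fs:[0x28] as the canary slot.
; Build: nasm -felf64 canary_fs.asm && ld canary_fs.o -o canary_fs
            global  _start

cookie      equ     12345                   ; fixed demo value; a real canary should be random

            section .text
_start:     mov     eax, 158                ; __NR_arch_prctl
            mov     edi, 0x1002             ; ARCH_SET_FS
            lea     rsi, [rel buf]          ; new FS base
            syscall
            test    rax, rax
            jnz     fail                    ; arch_prctl returns 0 on success

            mov     qword [fs:0x28], cookie ; initialize the canary slot

            call    func
            mov     edi, eax                ; exit status = return value of func
            mov     eax, 60                 ; __NR_exit
            syscall

func:       sub     rsp, 24                 ; 16 bytes of locals plus the canary slot
            mov     rax, [fs:0x28]
            mov     [rsp + 16], rax         ; canary sits just below the return address
            mov     qword [rsp], 0          ; stand-in for the real function body
            mov     rax, [rsp + 16]
            sub     rax, [fs:0x28]          ; zero if intact; sub + jne is the pair discussed above
            jne     fail
            add     rsp, 24                 ; RAX is already 0 here, which is the return value
            ret

fail:       mov     eax, 60                 ; __NR_exit
            mov     edi, 1                  ; error status 1
            syscall

            section .bss
buf:        resb    4096                    ; fs.base points at this buffer
```

With the canary intact the process should exit with status 0. Using sub for the check also leaves RAX zeroed on the success path, so the canary value does not linger in a register.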

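If the kernel has enabled wrfsbase for user-space use, you can use wrfsbase rsi instead of making a system call. I think the latest Linux kernel (5.10) may have started using wrfsbase itself, but I don't know whether it allows user-space to use it.

(It probably doesn't toggle FSGSBASE on and off on every use, so kernel use would mean user-space can use it too; the fault conditions in the manual don't mention privilege level, only the CPUID feature bit and the bit in the CR4 control register. And it is available only in 64-bit mode; in other modes, including compat mode, it will #UD.)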
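As far as I know, Linux 5.9 and later report user-space FSGSBASE availability through the HWCAP2_FSGSBASE bit in the AT_HWCAP2 entry of the ELF auxiliary vector. The sketch below relies on that, with AT_HWCAP2 = 26 and the bit value 2 taken from the kernel headers as I remember them, so check both against your own headers. It walks the auxiliary vector on the initial stack, uses wrfsbase if the bit is set, and otherwise falls back to arch_prctl. The file name and exit-status convention are invented for illustration.

```nasm
; fsgsbase.asm: use wrfsbase when the kernel advertises it, else fall back to arch_prctl.
; Build: nasm -felf64 fsgsbase.asm && ld fsgsbase.o -o fsgsbase
; Exit status: 0 = wrfsbase used, 1 = arch_prctl used, 2 = FS base did not take effect.
            global  _start

AT_HWCAP2       equ 26                      ; auxv entry type (assumed from kernel headers)
HWCAP2_FSGSBASE equ 2                       ; bit 1 of the AT_HWCAP2 value (assumed likewise)

            section .text
_start:     mov     rcx, [rsp]              ; argc
            lea     rbx, [rsp + rcx*8 + 16] ; skip argc, argv[] and its NULL: rbx -> envp[0]
.skip_env:  mov     rax, [rbx]
            add     rbx, 8
            test    rax, rax
            jnz     .skip_env               ; after envp's NULL, rbx -> first auxv pair
.next_aux:  mov     rax, [rbx]              ; a_type
            test    rax, rax
            jz      .use_syscall            ; AT_NULL reached: no AT_HWCAP2 entry
            cmp     rax, AT_HWCAP2
            je      .found
            add     rbx, 16                 ; each auxv entry is a (type, value) pair
            jmp     .next_aux
.found:     test    byte [rbx + 8], HWCAP2_FSGSBASE
            jz      .use_syscall

            lea     rsi, [rel buf]
            wrfsbase rsi                    ; set FS base directly, no system call
            xor     ebx, ebx                ; remember which path ran: 0 = wrfsbase
            jmp     .check

.use_syscall:
            mov     eax, 158                ; __NR_arch_prctl
            mov     edi, 0x1002             ; ARCH_SET_FS
            lea     rsi, [rel buf]
            syscall
            mov     ebx, 1                  ; 1 = arch_prctl fallback

.check:     mov     qword [fs:0x28], 12345  ; store through the FS base
            cmp     qword [rel buf + 0x28], 12345 ; same memory without the FS prefix
            je      .exit
            mov     ebx, 2                  ; FS base does not point at buf
.exit:      mov     edi, ebx
            mov     eax, 60                 ; __NR_exit
            syscall

            section .bss
buf:        resb    4096                    ; fs.base points at this buffer
```

Checking the auxiliary vector rather than CPUID matters because the CPU can support the instruction while the kernel leaves the CR4 bit clear, in which case wrfsbase would fault.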

x86-64

nasm

segmentation-fault

thread-local-storage

memory-segmentation