Why is my C code throwing a segmentation fault even though the return pointer points to a memory address for seemingly valid shellcode?

Question

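I'm trying to follow a tutorial on buffer overflows (Vivek Ramachandran's Buffer Overflow Primer). I'm following his code exactly, which works for him in the demo and which has worked for me up to this point.

The goal of the C program below is to assign shellcode for the exit system call to a variable, then replace the main function's default return address, which points into __libc_start_main, with the memory address of the shellcode variable, so that the program executes the shellcode once main finishes and then exits gracefully with a value of 20 (as if executing "exit(20)"). Unfortunately, the program ends in a segmentation fault instead. I'm running this on 32-bit Linux Mint. I'm compiling the code with gcc, using the -ggdb and -mpreferred-stack-boundary=2 options, and I have tried both with and without the -fno-stack-protector option.

The code is as follows: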

#include<stdio.h>

char shellcode[] = "\xbb\x16\x00\x00\x00"
                   "\xb8\x01\x00\x00\x00"
                   "\xcd\x80";

int main(){

        int *ret;

        ret = (int *)&ret +2;

        (*ret) = (int)shellcode;

}

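First, a variable named shellcode is defined, which holds the shellcode.
The main function is called and the ret variable is defined and placed at the top of the stack.
The memory location of the ret variable, plus 2 integer widths, that is, the location 8 bytes further down the stack (the address of the return pointer), is assigned to the ret variable.
The memory address of the shellcode variable is written to the memory address held in ret, that is, to the return address.
When the function reaches its return instruction, the shellcode is executed, which exits the program.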

I ran this through gdb and everything seems to check out:

The memory location of the shellcode variable is 0x804a01c

At the start of the execution of main, the return address is at the 3rd hex-word and points to __libc_start_main

After executing ret = (int *)&ret + 2, the value of ret is on the stack and is 8 bytes more than the beginning of the stack

After executing (*ret) = (int)shellcode, the return pointer (3rd hex-word) contains the address of the shellcode, rather than __libc_start_main

The program seems to move to resume execution at the memory address of the shellcode, but nevertheless ends in a segmentation fault.

Thanks in advance!

Answer 1

Adding the following option when compiling solved the problem:

-z execstack

Answer 2

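Classic buffer overflow exploits do involve executing code on the stack, but your program doesn't do that. Your shellcode array is not on the stack, and the construct you use to clobber main's return address so that it points to the shellcode array doesn't involve executing code on the stack either. When I run your program on my Linux box (also running on an x86 CPU), compiled with gcc -O0 -m32, it does set the EIP register to point to the machine code in shellcode. However, just as it does for you, it crashes with a segmentation fault.

The reason it crashes is that shellcode is loaded into a region of memory that is marked non-executable. (This region is called "the data segment".) The processor refuses to execute machine instructions from that region and instead raises an "exception" (a hardware concept, distinct from C++ exceptions), which the kernel translates into a SIGSEGV signal.

Older tutorials on writing shellcode and buffer overflow exploits won't warn you about this possibility, because older generations of the x86 architecture could not mark memory non-executable on a per-page basis. In the "flat" segment-register configuration used by most 32-bit x86 operating systems, any readable page was also executable. More recent generations of the architecture can mark individual pages non-executable, and you have to work around that. (If I remember correctly, per-page executability was added to the x86 architecture around 2003, at the same time as 64-bit mode, but it took longer for operating system support to become ubiquitous.)

On my Linux box, set up as described above, this modified version of your program successfully transfers control to the machine code in shellcode and executes it. It uses the mprotect system call to make the region of memory containing shellcode executable.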

#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

const char shellcode[] =
    "\xbb\x16\x00\x00\x00"
    "\xb8\x01\x00\x00\x00"
    "\xcd\x80";

int main(void)
{
  uintptr_t pagesize = sysconf(_SC_PAGESIZE);
  if (mprotect((void *)(((uintptr_t)shellcode) & ~(pagesize - 1)),
               pagesize, PROT_READ|PROT_EXEC)) {
    perror("mprotect");
    return 1;
  }

  void **ret;
  ret = (void **) &ret;
  ret[9] = (void *)shellcode;

  return 0;
}

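Besides the mprotect operation itself, notice how adding that block of code changed the stack layout and put the return address in a different location. If I compile with optimization enabled, the stack layout changes again and the return address is not overwritten. Also notice that I made shellcode a const char array. If I had not done that, I would have needed to use PROT_READ|PROT_WRITE|PROT_EXEC in the mprotect call to avoid a premature crash caused by some random global variable suddenly not being writable when the C library expects it to be, and then the kernel might have failed the mprotect call because of a "W^X" security policy.

Depending on the age of your kernel and C library, making shellcode a const char array might be enough by itself, but with kernel 4.19 and glibc 2.28, which is what I have, read-only data is also non-executable.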

Answer 3

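Your shellcode contains null bytes. Try to use the smallest registers you can, and use xor when you need to zero a register. The problem with null bytes is that when C sees a null byte ('\x00'), it stops reading at that byte, which leads to execution problems such as a segmentation fault.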


c

stack

buffer-overflow

shellcode