What are "shadow bytes" in AddressSanitizer and how should I interpret them?

I am debugging a C program, and when AddressSanitizer finds a problem I am very confused by the lower half of its output. Take this as an example:

==33184==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000005 at pc 0x55f312fe2509 bp 0x7ffc99f5f5c0 sp 0x7ffc99f5f5b0
WRITE of size 1 at 0x602000000005 thread T0
    #0 0x55f312fe2508 in main /home/user/c/friends/main.c:20
    #1 0x7fa5ea0e9b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #2 0x55f312fe21c9 in _start (/home/user/c/friends/cmake-build-debug/friends+0x11c9)

0x602000000005 is located 11 bytes to the left of 5-byte region [0x602000000010,0x602000000015)
allocated by thread T0 here:
    #0 0x7fa5eb2b8b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
    #1 0x55f312fe23f4 in main /home/user/c/friends/main.c:18
    #2 0x7fa5ea0e9b96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/c/friends/main.c:20 in main

  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000:[fa]fa 05 fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==33184==ABORTING

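I understand everything above this line: SUMMARY: AddressSanitizer: heap-buffer-overflow /home/user/c/friends/main.c:20 in main

My questions concern the data shown below that line. I read this answer, but it did not answer my question. The memory dump ASAN shows looks like this: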

  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000:[fa]fa 05 fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
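  1. What does the line with the arrow mean? My assumption is that the 05 which appears in between the fas refers to the "5-byte region" in "0x602000000005 is located 11 bytes to the left of 5-byte region". However, I am still confused because the legend says that fa means "heap left redzone," yet it appears to the right of the 05 and to the left of it. Why are there no "heap right redzones?"

  2. In this example, ASAN says that the program went 11 bytes out of the 5-byte region, yet it shows far more fas than that.

  3. Is there any proper, detailed documentation which actually explains what these terms "heap left redzone", "stack mid redzone", "Global redzone", etc. mean? I was not able to find any.

  4. What is a "Shadow byte/address" in this context?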

What are “shadow bytes” in AddressSanitizer and how should I interpret them?

From the AddressSanitizerAlgorithm page on GitHub (which is also linked from the LLVM AddressSanitizer page):

The virtual address space is divided into 2 disjoint classes:

  • Main application memory (Mem): this memory is used by the regular application code.
  • Shadow memory (Shadow): this memory contains the shadow values (or metadata). There is a correspondence between the shadow and the main application memory. Poisoning a byte in the main memory means writing some special value into the corresponding shadow memory.

So "shadow bytes" are metadata describing the state of the program's addressable memory.

If we look at the asan output:

Shadow byte legend (one shadow byte represents 8 application bytes):

it tells us that the hexdump is of shadow memory, which describes the state of the program's "real" memory. Which states does it track?

  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  ...

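So, if an entire 8-byte line is addressable, the shadow byte tracking (or shadowing) it should have the value 00. If the line is only partially addressable, the shadow byte will be 01..07, which is the number of addressable bytes at the start of the line.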

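The value the hexdump is pointing you at is fa, or "Heap left redzone". Presumably this is some kind of guard zone around a heap allocation, used to detect overruns.

From the same link: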

The run-time library replaces the malloc and free functions. The memory around malloc-ed regions (red zones) is poisoned

More broadly, this description (in program addresses)

0x602000000005 is located 11 bytes to the left of 5-byte region
  [0x602000000010,0x602000000015)

matches the shadow map shown:

=>0x0c047fff8000:[fa]fa 05 fa ...

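Assuming natural alignment,

  • the shadow byte at 0x0c047fff8000 describes (or, again, shadows) program addresses 0x602000000000..0x602000000007, which include the address you accessed
  • the next shadow byte, at 0x0c047fff8001, describes program addresses 0x602000000008..0x60200000000F
  • both have the value fa, meaning "heap left redzone"
  • the next shadow byte, at 0x0c047fff8002, describes program addresses 0x602000000010..0x602000000017 and has the value 05, which means 5 bytes are addressable. These are the 5 bytes of your heap allocation.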

All of this is consistent with the part of the error you did understand.

  1. However, I am still confused because the legend says that fa means "heap left redzone," yet it appears to the right of the 05 and to the left of it. Why are there no "heap right redzones?"

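    I don't know what the directionality is really supposed to mean here. The heap generally grows in one direction at first (traditionally upwards, while the stack grows downwards), but it can be fragmented, freed, coalesced and reallocated. Is the gap between two allocations "right" or "left", or both? All we need to know is that it is a poisoned heap region that was never allocated to the user.

    If there is no directionality corresponding to the stack left/mid/right values, perhaps it should just be called "Heap redzone".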

  2. In this example, ASAN says that the program went 11 bytes out of the 5-byte region, yet it shows far more fas than that.

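    Each fa represents eight bytes, as the legend says. So, if you had accessed anything from 9 to 16 bytes before your allocation, it would show up in the same shadow byte. If you had accessed one to eight bytes before it, it would show up in the next shadow byte (the one immediately before the 05).

    The rest of the fas are just a map of the surrounding area, which doesn't seem useful in this case but might be in others.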

  3. Is there any proper, detailed documentation which actually explains what these terms "heap left redzone", "stack mid redzone", "Global redzone", etc mean?

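    No idea. They do seem to follow fairly naturally from the use case, though: you hit a red zone = you accessed an address you shouldn't have. You can always read the code, e.g. asan_internal.h defines the kAsanHeapLeftRedzoneMagic value, and asan_allocator.cpp poisons shadow bytes with it.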

  4. What is a "Shadow byte/address" in this context?

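    For completeness, a shadow byte is a byte which shadows a group of eight regular, accessible program bytes and tracks some information about them that is useful to the sanitizer.

    A shadow address is the address of a shadow byte.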