C++ garbage collection

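There are many garbage collection libraries for C++.

I'm a bit confused about how pointer tracking works.

In particular, suppose we have a base pointer P and a list of other pointers that are computed as offsets from P, with the offsets stored in an array.

For example,

P2 = P + offset[0]

How does the garbage collector know that P2 is still in scope? Nothing refers to it directly, but it is still reachable.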
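
To make the situation concrete, here is a minimal sketch of the kind of code I mean. The names `OffsetView`, `base`, `offsets`, and `at` are hypothetical and only illustrate the question; they do not come from any GC library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: the program stores the base pointer P and a
// table of offsets, but never stores P2 itself.
struct OffsetView {
    char* base;                           // P
    std::vector<std::ptrdiff_t> offsets;  // offsets relative to P

    // Recomputes P2 = P + offsets[i] on demand.
    char* at(std::size_t i) const {
        assert(i < offsets.size());
        return base + offsets[i];
    }
};
```

Here the only value held in memory is `base`; every derived pointer exists only while `at` is running or while its result sits in a local variable.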

Probably the most popular C++ GC is

https://en.m.wikipedia.org/wiki/Boehm_garbage_collector

But going by their example syntax, it seems very easy to break, so I must be misunderstanding something.

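This question can't be answered in general. There are different systems that could be considered garbage collection for C++; for example, Herb Sutter's deferred_ptr is basically a garbage-collecting smart pointer. I've personally implemented another version of this idea, similar to Sutter's but less fancy.

I can, however, answer the question for Boehm. The Boehm garbage collector identifies pointers during its "mark" phase basically by scanning memory and assuming that anything that looks like a pointer is a pointer.

The garbage collector knows all the regions of memory where user data lives, and it knows every pointer it has allocated and the size of each of those allocations. It simply looks for chains of pointers starting from the "root segments" defined below, where "looks for" means explicitly scanning memory for 64-bit values that are the same as one of the GC allocations it has made.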
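
As a rough sketch of the "does this value look like a pointer?" test, assuming a toy allocation table rather than Boehm's real data structures: `AllocTable` and `find_allocation` are hypothetical names. The sketch also assumes that an address anywhere inside a block counts as a hit, which the recorded sizes make possible and which is what would keep a block alive through a pointer like P + offset; whether a given collector build accepts such interior pointers is a configuration detail I am not asserting here.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Toy stand-in for the collector's bookkeeping (an assumption, not Boehm's
// real structures): start address of each allocation -> its size in bytes.
using AllocTable = std::map<std::uintptr_t, std::size_t>;

// Returns the start address of the allocation that `word` refers to, or 0 if
// the value does not fall inside any known allocation.
// Assumption: an address anywhere inside a block counts as a hit.
std::uintptr_t find_allocation(const AllocTable& table, std::uintptr_t word) {
    auto it = table.upper_bound(word);  // first block that starts after `word`
    if (it == table.begin()) return 0;  // `word` lies below every block
    --it;                               // last block starting at or before `word`
    return (word - it->first < it->second) ? it->first : 0;
}
```

The scanner never needs to know the type of the word it is looking at; it only asks whether the bit pattern lands in memory the collector handed out.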

From here:

Since it cannot generally tell where pointer variables are located, it scans the following root segments for pointers:

  • The registers. Depending on the architecture, this may be done using assembly code, or by calling a setjmp-like function which saves register contents on the stack.
  • The stack(s). In the case of a single-threaded application, on most platforms this is done by scanning the memory between (an approximation of) the current stack pointer and GC_stackbottom. (For Itanium, the register stack scanned separately.) The GC_stackbottom variable is set in a highly platform-specific way depending on the appropriate configuration information in gcconfig.h. Note that the currently active stack needs to be scanned carefully, since callee-save registers of client code may appear inside collector stack frames, which may change during the mark process. This is addressed by scanning some sections of the stack "eagerly", effectively capturing a snapshot at one point in time.
  • Static data region(s). In the simplest case, this is the region between DATASTART and DATAEND, as defined in gcconfig.h. However, in most cases, this will also involve static data regions associated with dynamic libraries. These are identified by the mostly platform-specific code in dyn_load.c.
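
The way a mark phase could follow pointer chains outward from those root segments can be sketched with a toy model. Everything here is an assumption for illustration: `WordRange`, `Heap`, and `mark` are hypothetical names, the root segments are plain arrays standing in for registers, stack, and static data, and for brevity a word only counts as a pointer when it equals the start address of a block.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// A range of pointer-sized words to scan: a root segment or a heap block.
struct WordRange {
    const std::uintptr_t* begin;
    const std::uintptr_t* end;
};

// Toy bookkeeping (hypothetical, not Boehm's structures): start address of
// each collector-owned block -> the words that block contains.
using Heap = std::map<std::uintptr_t, WordRange>;

// Conservative mark: any scanned word equal to the start of a known block
// marks that block, and each newly marked block is scanned in turn.
std::set<std::uintptr_t> mark(const Heap& heap,
                              const std::vector<WordRange>& roots) {
    std::set<std::uintptr_t> marked;
    std::vector<WordRange> pending(roots.begin(), roots.end());
    while (!pending.empty()) {
        WordRange range = pending.back();
        pending.pop_back();
        for (const std::uintptr_t* w = range.begin; w != range.end; ++w) {
            auto hit = heap.find(*w);
            // insert().second is true only the first time a block is marked,
            // so cycles between blocks do not loop forever.
            if (hit != heap.end() && marked.insert(hit->first).second) {
                pending.push_back(hit->second);  // follow the chain
            }
        }
    }
    return marked;
}
```

Blocks that are absent from the returned set were not reached from any root and would be candidates for reclamation.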

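The address space of 64-bit pointers is huge, so false positives are rare. Even when one occurs, it is just a leak, and it lasts only as long as some other value in memory that the mark phase scans happens to be exactly equal to a 64-bit pointer that the garbage collector allocated.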
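
A small sketch of why such a false positive is possible at all: an integer and a pointer with the same value occupy memory as the same bytes, so a scanner that only sees raw words cannot tell them apart. The function name `same_bit_pattern` is hypothetical, and the sketch assumes a platform where `std::uintptr_t` and `void*` have the same size and representation, as on mainstream 64-bit systems.

```cpp
#include <cstdint>
#include <cstring>

// Returns true when an integer slot and a pointer slot hold identical bytes.
// Assumption: uintptr_t and void* share size and representation.
bool same_bit_pattern(std::uintptr_t integer_slot, const void* pointer_slot) {
    static_assert(sizeof(std::uintptr_t) == sizeof(const void*),
                  "sketch assumes pointer-sized integers");
    return std::memcmp(&integer_slot, &pointer_slot, sizeof(pointer_slot)) == 0;
}
```

If a hash, counter, or timestamp in scanned memory happens to match an allocation's address in this way, the block stays marked until that value changes or goes away.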