Garbage collection and custom getter in SWIG

Question

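I'm not the author, but a public software package I use seems to be leaking memory (Github issue). I'm trying to figure out how to patch it so that it works correctly.

To narrow the problem down, there is a struct, call it xxx_t. First, %extend is used to make a member of the struct available in Python: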

%extend xxx_t {
    char *surface;
}

There is also a custom getter. Exactly what it does isn't important here, except that it uses new to create a char*.

%{
char* xxx_t_surface_get(xxx *n) {
  char *s = new char [n->length + 1];
  memcpy (s, n->surface, n->length);
  s[n->length] = '\0';
  return s;
}
%}

Currently the code has this line to handle garbage collection:

%newobject surface;

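This does not seem to work as expected. %newobject xxx_t::surface; doesn't work either. If I replace it with %newobject xxx_t_surface_get; that doesn't work because the getter function is escaped (it sits inside %{ ... %}).

What is the right way to tell SWIG about the char* so that it gets freed?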

Answer 1

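Before getting started it's worth pointing out one thing: because you return char*, the result goes through SWIG's normal string typemaps to produce a Python string.

Having said that, let's look at what the currently generated code looks like. We can start the investigation with the following SWIG interface definition to experiment with: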

%module test 

%inline %{
  struct foobar {
  };
%}

%extend foobar {
  char *surface;
}

If we run something like this through SWIG, we'll see a generated function that wraps your _surface_get code, something like this:

SWIGINTERN PyObject *_wrap_foobar_surface_get(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
  PyObject *resultobj = 0;
  foobar *arg1 = (foobar *) 0 ;
  void *argp1 = 0 ;
  int res1 = 0 ;
  PyObject * obj0 = 0 ;
  char *result = 0 ;

  if (!PyArg_ParseTuple(args,(char *)"O:foobar_surface_get",&obj0)) SWIG_fail;
  res1 = SWIG_ConvertPtr(obj0, &argp1,SWIGTYPE_p_foobar, 0 |  0 );
  if (!SWIG_IsOK(res1)) {
    SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "foobar_surface_get" "', argument " "1"" of type '" "foobar *""'"); 
  }
  arg1 = reinterpret_cast< foobar * >(argp1);
  result = (char *)foobar_surface_get(arg1);
  resultobj = SWIG_FromCharPtr((const char *)result);
  /* result is never used again from here onwards */
  return resultobj;
fail:
  return NULL;
}

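The thing to note here, however, is that the result of calling your getter is lost when this wrapper returns. That is to say, it isn't even tied to the lifetime of the Python string object that gets returned.

So there are several ways we could fix this:

One option is to ensure that the generated wrapper calls delete[] on the result of calling your getter, after SWIG_FromCharPtr has happened. This is exactly what %newobject does in this instance. (See below.)
Another alternative is to keep the allocated buffer between calls, probably in some thread-local storage, and track its size to minimise allocations.
Alternatively, we could use some kind of RAII-based object to own the temporary buffer and make sure it gets deleted. (We could even do something neat with operator void* if we wanted.)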
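The RAII option can be sketched in plain C++, outside of SWIG. This is only an illustration under assumptions: the struct node, copy_surface and surface_to_target are hypothetical names, and std::string stands in for the Python string the wrapper would build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for the wrapped struct: a pointer into a larger
// buffer plus a length, so the text is not NUL-terminated on its own.
struct node {
  const char *surface;
  std::size_t length;
};

// Returns a NUL-terminated copy owned by a unique_ptr, so delete[] runs
// automatically when the owner goes out of scope.
std::unique_ptr<char[]> copy_surface(const node &n) {
  assert(n.surface != nullptr);
  std::unique_ptr<char[]> s(new char[n.length + 1]);
  std::memcpy(s.get(), n.surface, n.length);
  s[n.length] = '\0';
  return s;
}

// Plays the role of the generated wrapper: convert the temporary buffer,
// then let it be released on return. std::string stands in for the
// Python string object.
std::string surface_to_target(const node &n) {
  std::unique_ptr<char[]> tmp = copy_surface(n);
  return std::string(tmp.get());
}
```

With this shape there is no code path on which the temporary buffer outlives the conversion, which is the property the hand-written delete[] has to guarantee manually.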

If we change our interface to add %newobject, like so:

%module test 

%inline %{
  struct foobar {
  };
%}

%newobject surface;

%extend foobar {
  char *surface;
}

Then we see that our generated code now looks like this:

  // ....
  result = (char *)foobar_surface_get(arg1);
  resultobj = SWIG_FromCharPtr((const char *)result);
  delete[] result;
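To make the effect of those three lines concrete, here is a small plain C++ model of the call, convert, delete[] sequence. It is a sketch under assumptions, not SWIG output: live_buffers and wrapper_with_newobject are hypothetical names used only for bookkeeping, and std::string stands in for the Python string built by SWIG_FromCharPtr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Number of getter buffers allocated and not yet freed (bookkeeping for
// this sketch only; nothing like it exists in the generated wrapper).
static int live_buffers = 0;

// Same shape as the getter in the question: returns a new[]-allocated copy.
char *surface_get(const char *src, std::size_t length) {
  assert(src != nullptr);
  char *s = new char[length + 1];
  ++live_buffers;
  std::memcpy(s, src, length);
  s[length] = '\0';
  return s;
}

// Mirrors the three generated lines: call, convert, delete[].
std::string wrapper_with_newobject(const char *src, std::size_t length) {
  char *result = surface_get(src, length);
  std::string resultobj(result);  // copies the text out of the buffer
  delete[] result;                // safe: resultobj owns its own copy
  --live_buffers;
  return resultobj;
}
```

Because the conversion copies the characters, freeing the buffer immediately afterwards does not affect the returned value.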

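The same delete[] call shows up in the real code on github too, so this isn't the bug you're looking for.

Typically for C++ I'd lean towards the RAII option. As it happens, there's a neat way to do this from both a SWIG perspective and a C++ one: std::string. So we can fix the leak in a simple and clean way just by doing the following: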

%include <std_string.i> /* If you don't already have this... */

%extend xxx_t {
    std::string surface;
}

%{
std::string xxx_t_surface_get(xxx *n) {
  return std::string(n->surface, n->length);
}
%}

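(You'll need to change the setter to match too, unless you make the member const so that there is no setter.)

The std::string version still makes two sets of allocations for the same output, though. First the std::string object makes one allocation, and then a second allocation occurs for the Python string object. That's all for something whose buffer already exists in C++ anyway. So whilst this change is sufficient and correct to fix the leak, you can also go further and write a version that does less copying: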
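As a quick check of why the std::string getter needs neither new nor a manual terminator, the pointer-plus-length constructor can be exercised on its own. The struct node below is a hypothetical stand-in for the wrapped type, and surface_get is an illustrative name rather than the package's function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the wrapped struct: surface points into a
// larger buffer and length says how many bytes belong to this node.
struct node {
  const char *surface;
  std::size_t length;
};

// Copies exactly length bytes; no NUL terminator is required at
// surface[length], and the string frees its own storage.
std::string surface_get(const node *n) {
  assert(n != nullptr && n->surface != nullptr);
  return std::string(n->surface, n->length);
}
```

For a node covering the first 5 bytes of "hello world", this yields a 5-character string and leaves nothing for the caller to free.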

%extend xxx_t {
    PyObject *surface;
}

%{
PyObject *xxx_t_surface_get(xxx *n) {
  return SWIG_FromCharPtrAndSize(n->surface, n->length);
}
%}
