Python 扩展在处理大型列表时创建无效指针
Python extension creates invalid pointers when manipulating large lists
我设法为 python 列表实现了 Fisher–Yates 随机播放功能,作为习惯扩展 python 的练习。它非常适用于相对较小的列表,除非我 运行 多次使用该函数。
每当列表大小超过 100 时,我就会遇到各种内存问题:
>>>import evosutil
>>> a=[i for i in range(100)]
>>> evosutil.shuffle(a)
>>> a
[52, 66, 0, 58, 41, 18, 50, 37, 81, 43, 74, 49, 90, 20, 63, 32, 89, 60, 2, 44, 3, 80, 15, 24, 22, 69, 86, 31, 56, 68, 34, 13, 38, 26, 14, 91, 73, 79, 39, 65, 5, 75, 84, 55, 7, 53, 93, 42, 40, 9, 51, 82, 29, 30, 99, 64, 33, 97, 27, 11, 6, 67, 16, 94, 95, 62, 57, 17, 78, 77, 71, 98, 72, 8, 88, 36, 85, 59, 21, 96, 23, 46, 10, 12, 48, 83, 4, 92, 45, 54, 1, 25, 19, 70, 35, 61, 47, 28, 87, 76]
>>> (Ctrl-D)
*** Error in `python3': free(): invalid next size (fast): 0x083fe680 ***
或者,当尝试对包含 1000 个元素的列表进行操作时:
*** Error in `python3': munmap_chunk(): invalid pointer: 0x083ff0e0 ***
或者,
Segmentation fault (core dumped)
这是产生错误的模块的代码:
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
PyObject* tmp=PyList_GetItem(list, i2);
PyList_SetItem(list, i2, PyList_GetItem(list, i1));
PyList_SetItem(list, i1, tmp);
}
//Naive Fisher–Yates shuffle
static PyObject* shuffle(PyObject* self, PyObject* args){
PyObject* list;
PyArg_ParseTuple(args,"O", &list);
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::minstd_rand0 rand(seed);
Py_ssize_t size = PyList_Size(list);
for(int i=0; i<size;++i){
int randIndex = rand()%size;
_List_SwapItems(list, randIndex, i);
}
Py_RETURN_NONE;
}
我觉得我应该可以使用 free() 或 Py_DECREF() 在某处解决这个问题,但我不知道在哪里。我不认为我正在创建任何对象,只是移动它们。那么内存问题是从哪里来的呢?
在将它们传递给 PyList_SetItem()
之前,您需要 Py_XINCREF()
两个对象。此外,捕捉 i1 == i2
:
的特殊情况
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
if (i1 == i2) {
return;
}
PyObject* obj1=PyList_GetItem(list, i1);
PyObject* obj2=PyList_GetItem(list, i2);
Py_XINCREF(obj1);
Py_XINCREF(obj2);
PyList_SetItem(list, i2, obj1);
PyList_SetItem(list, i1, obj2);
}
PyList_GetItem()
returns 借用的引用,即它 INCREF
不是它 returns 的对象。如果您没有任何其他引用,则引用计数将为 1
(因为它仅从列表中引用)。当您调用 PyList_SetItem(list, i2, ...)
时,列表 Py_XDECREF()
是先前存储在 i2
的对象(您保留在 tmp
中)。此时,引用计数达到 0
并且对象被释放。哎呀。
同样,您不能只调用 PyList_SetItem(list, i, PyList_GetItem())
,因为 SetItem
窃取了您传递给它的引用。您不拥有引用,但是 'old' 列表拥有。所以你在这里也需要一个Py_XINCREF
。
有关详细信息,请参阅 list API documentation。
作为进一步的建议,您可以考虑不直接针对 Python 扩展 API 进行编程。完成任何事情都需要大量代码,而且要保持引用计数正确也非常困难。到目前为止,还有多种其他方法可以将 Python 与 C 或 C++ 交互。 CFFI seems to be the low-level interface the Python ecosystem will standardize on. SIP and SWIG may offer better support for C++, though. For a SIP example, see this answer.
除了引用计数错误之外,您的扩展函数中还有更多问题,更多问题如下:
虽然具有适当引用计数的 PyList_SetItem
是首选方法,但(丑陋的)选项是使用 PyList_SET_ITEM
宏,它可以避免执行 INCREF:
void PyList_SET_ITEM(PyObject *list, Py_ssize_t i, PyObject *o)
Macro form of PyList_SetItem()
without error checking. This is normally only used to fill in new lists where there is no previous
content.
Note
This macro “steals” a reference to item, and, unlike PyList_SetItem()
, does not discard a reference to any item that is
being replaced; any reference in list at position i
will be leaked.
因此 PyList_SET_ITEM
既不递增也不递减任何引用计数器,这很适合我们,因为最初和最终的元素都在同一个列表中。
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
PyObject* tmp = PyList_GET_ITEM(list, i2);
PyList_SET_ITEM(list, i2, PyList_GET_ITEM(list, i1));
PyList_SET_ITEM(list, i1, tmp);
}
请注意,这根本不会进行任何错误检查,因此您需要确保您的索引在范围内(for
循环负责)。
您的代码还有另一个尚未讨论的严重问题 - 完全缺乏错误检查。例如,当传入一个非列表对象时,你应该引发一个 TypeError
。现在代码将在 PyList_Size
、returning -1 处失败并设置内部异常,这可能导致所有未来 C 扩展的错误行为:
同样 PyArg_ParseTuple
可以并且 如果传入的参数数量不正确 将会失败,因此您必须检查它的 return 值;在这种情况下,list
可以未初始化,您的代码将具有完全未定义的行为。
C-API 文档 states the following:
When a function must fail because some function it called failed, it
generally doesn’t set the error indicator; the function it called
already set it. It is responsible for either handling the error and
clearing the exception or returning after cleaning up any resources it
holds (such as object references or memory allocations); it should not
continue normally if it is not prepared to handle the error. If
returning due to an error, it is important to indicate to the caller
that an error has been set. If the error is not handled or carefully
propagated, additional calls into the Python/C API may not behave as
intended and may fail in mysterious ways.
因此,这是编写扩展函数的正确方法:
static PyObject* shuffle(PyObject* self, PyObject* args){
PyObject* list;
if (! PyArg_ParseTuple(args, "O", &list)) {
// PyArg_ParseTuple set the proper exception
return NULL;
}
if (! PyList_Check(list)) {
PyErr_SetString(PyExc_TypeError,
"bad argument to shuffle; list expected");
return NULL;
}
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::minstd_rand0 rand(seed);
Py_ssize_t size = PyList_Size(list);
for(int i=0; i<size;++i){
int randIndex = rand()%size;
_List_SwapItems(list, randIndex, i);
}
Py_RETURN_NONE;
}
我设法为 python 列表实现了 Fisher–Yates 随机播放功能,作为习惯扩展 python 的练习。它非常适用于相对较小的列表,除非我 运行 多次使用该函数。
每当列表大小超过 100 时,我就会遇到各种内存问题:
>>>import evosutil
>>> a=[i for i in range(100)]
>>> evosutil.shuffle(a)
>>> a
[52, 66, 0, 58, 41, 18, 50, 37, 81, 43, 74, 49, 90, 20, 63, 32, 89, 60, 2, 44, 3, 80, 15, 24, 22, 69, 86, 31, 56, 68, 34, 13, 38, 26, 14, 91, 73, 79, 39, 65, 5, 75, 84, 55, 7, 53, 93, 42, 40, 9, 51, 82, 29, 30, 99, 64, 33, 97, 27, 11, 6, 67, 16, 94, 95, 62, 57, 17, 78, 77, 71, 98, 72, 8, 88, 36, 85, 59, 21, 96, 23, 46, 10, 12, 48, 83, 4, 92, 45, 54, 1, 25, 19, 70, 35, 61, 47, 28, 87, 76]
>>> (Ctrl-D)
*** Error in `python3': free(): invalid next size (fast): 0x083fe680 ***
或者,当尝试对包含 1000 个元素的列表进行操作时:
*** Error in `python3': munmap_chunk(): invalid pointer: 0x083ff0e0 ***
或者,
Segmentation fault (core dumped)
这是产生错误的模块的代码:
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
PyObject* tmp=PyList_GetItem(list, i2);
PyList_SetItem(list, i2, PyList_GetItem(list, i1));
PyList_SetItem(list, i1, tmp);
}
//Naive Fisher–Yates shuffle
static PyObject* shuffle(PyObject* self, PyObject* args){
PyObject* list;
PyArg_ParseTuple(args,"O", &list);
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::minstd_rand0 rand(seed);
Py_ssize_t size = PyList_Size(list);
for(int i=0; i<size;++i){
int randIndex = rand()%size;
_List_SwapItems(list, randIndex, i);
}
Py_RETURN_NONE;
}
我觉得我应该可以使用 free() 或 Py_DECREF() 在某处解决这个问题,但我不知道在哪里。我不认为我正在创建任何对象,只是移动它们。那么内存问题是从哪里来的呢?
在将它们传递给 PyList_SetItem()
之前,您需要 Py_XINCREF()
两个对象。此外,捕捉 i1 == i2
:
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
if (i1 == i2) {
return;
}
PyObject* obj1=PyList_GetItem(list, i1);
PyObject* obj2=PyList_GetItem(list, i2);
Py_XINCREF(obj1);
Py_XINCREF(obj2);
PyList_SetItem(list, i2, obj1);
PyList_SetItem(list, i1, obj2);
}
PyList_GetItem()
returns 借用的引用,即它 INCREF
不是它 returns 的对象。如果您没有任何其他引用,则引用计数将为 1
(因为它仅从列表中引用)。当您调用 PyList_SetItem(list, i2, ...)
时,列表 Py_XDECREF()
是先前存储在 i2
的对象(您保留在 tmp
中)。此时,引用计数达到 0
并且对象被释放。哎呀。
同样,您不能只调用 PyList_SetItem(list, i, PyList_GetItem())
,因为 SetItem
窃取了您传递给它的引用。您不拥有引用,但是 'old' 列表拥有。所以你在这里也需要一个Py_XINCREF
。
有关详细信息,请参阅 list API documentation。
作为进一步的建议,您可以考虑不直接针对 Python 扩展 API 进行编程。完成任何事情都需要大量代码,而且要保持引用计数正确也非常困难。到目前为止,还有多种其他方法可以将 Python 与 C 或 C++ 交互。 CFFI seems to be the low-level interface the Python ecosystem will standardize on. SIP and SWIG may offer better support for C++, though. For a SIP example, see this answer.
除了引用计数错误之外,您的扩展函数中还有更多问题,更多问题如下:
虽然具有适当引用计数的 PyList_SetItem
是首选方法,但(丑陋的)选项是使用 PyList_SET_ITEM
宏,它可以避免执行 INCREF:
void PyList_SET_ITEM(PyObject *list, Py_ssize_t i, PyObject *o)
Macro form of
PyList_SetItem()
without error checking. This is normally only used to fill in new lists where there is no previous content.Note
This macro “steals” a reference to item, and, unlike
PyList_SetItem()
, does not discard a reference to any item that is being replaced; any reference in list at positioni
will be leaked.
因此 PyList_SET_ITEM
既不递增也不递减任何引用计数器,这很适合我们,因为最初和最终的元素都在同一个列表中。
inline void _List_SwapItems(PyObject* list, Py_ssize_t i1, Py_ssize_t i2){
PyObject* tmp = PyList_GET_ITEM(list, i2);
PyList_SET_ITEM(list, i2, PyList_GET_ITEM(list, i1));
PyList_SET_ITEM(list, i1, tmp);
}
请注意,这根本不会进行任何错误检查,因此您需要确保您的索引在范围内(for
循环负责)。
您的代码还有另一个尚未讨论的严重问题 - 完全缺乏错误检查。例如,当传入一个非列表对象时,你应该引发一个 TypeError
。现在代码将在 PyList_Size
、returning -1 处失败并设置内部异常,这可能导致所有未来 C 扩展的错误行为:
同样 PyArg_ParseTuple
可以并且 如果传入的参数数量不正确 将会失败,因此您必须检查它的 return 值;在这种情况下,list
可以未初始化,您的代码将具有完全未定义的行为。
C-API 文档 states the following:
When a function must fail because some function it called failed, it generally doesn’t set the error indicator; the function it called already set it. It is responsible for either handling the error and clearing the exception or returning after cleaning up any resources it holds (such as object references or memory allocations); it should not continue normally if it is not prepared to handle the error. If returning due to an error, it is important to indicate to the caller that an error has been set. If the error is not handled or carefully propagated, additional calls into the Python/C API may not behave as intended and may fail in mysterious ways.
因此,这是编写扩展函数的正确方法:
static PyObject* shuffle(PyObject* self, PyObject* args){
PyObject* list;
if (! PyArg_ParseTuple(args, "O", &list)) {
// PyArg_ParseTuple set the proper exception
return NULL;
}
if (! PyList_Check(list)) {
PyErr_SetString(PyExc_TypeError,
"bad argument to shuffle; list expected");
return NULL;
}
unsigned seed = std::chrono::system_clock::now().time_since_epoch().count();
std::minstd_rand0 rand(seed);
Py_ssize_t size = PyList_Size(list);
for(int i=0; i<size;++i){
int randIndex = rand()%size;
_List_SwapItems(list, randIndex, i);
}
Py_RETURN_NONE;
}