Updating elements of an array using the Python3/C API

Question

I have a module method that takes a Python list and returns the same list with every item multiplied by 100.

I have tried to follow the C intro here as closely as possible, but I am still running into issues.

static PyObject *
test_update_list(PyObject *self, PyObject *args)
{
    PyObject *listObj = NULL;
    PyObject *item = NULL;
    PyObject *mult = PyLong_FromLong(100);
    PyObject *incremented_item = NULL;

    if (!PyArg_ParseTuple(args, "O", &listObj))
    {
        return NULL;
    }

    /* get the number of lines passed to us */
    Py_ssize_t numLines = PyList_Size(listObj);

    /* should raise an error here. */
    if (numLines < 0) return NULL; /* Not a list */

    for (Py_ssize_t i=0; i<numLines; i++) {
        // pick the item 
        item = PyList_GetItem(listObj, i);

        if (mult == NULL)
            goto error;

        // increment it
        incremented_item = PyNumber_Add(item, mult);

        if (incremented_item == NULL)
            goto error;

        // update the list item
        if (PyObject_SetItem(listObj, i, incremented_item) < 0)
            goto error;

    }
error:
    Py_XDECREF(item);
    Py_XDECREF(mult);
    Py_XDECREF(incremented_item);
    return listObj;
};

The above compiles fine, but when I run it in ipython I get the error below.

If I take away the error handling, I get a segfault.

---------------------------------------------------------------------------
SystemError                               Traceback (most recent call last)
SystemError: null argument to internal routine

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
<ipython-input-3-da275aa3369f> in <module>()
----> 1 testadd.test_update_list([1,2,3])

SystemError: <built-in function ulist> returned a result with an error set

Any help is appreciated.

Answer 1

So you have a number of issues that need correcting. I've listed them all under separate headings so you can go through them one at a time.

Always returning `listObj`

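When you hit an error in your for loop, you goto the error label, which still returns the list. By returning the list you hide the fact that an error occurred in your function. You must always return NULL when you want your function to raise an exception.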
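
To make the "NULL means failure" rule concrete, here is a minimal sketch in plain C rather than the Python C API. The function add_to_all and its overflow check are hypothetical, invented only for illustration. The idea is that the result pointer stays NULL until every step has succeeded, so the shared cleanup label can never hand back a success value by accident.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: adds delta to every element of values in place.
   Returns values on success and NULL on failure. */
static int *add_to_all(int *values, size_t n, int delta)
{
    assert(values != NULL && delta >= 0);

    int *result = NULL;   /* stays NULL unless every step succeeds */
    int *scratch = malloc((n ? n : 1) * sizeof *scratch);
    if (scratch == NULL)
        goto done;

    for (size_t i = 0; i < n; i++) {
        /* treat integer overflow as the error case */
        if (values[i] > INT_MAX - delta)
            goto done;
        scratch[i] = values[i] + delta;
    }

    memcpy(values, scratch, n * sizeof *values);
    result = values;      /* the success path is the only one that sets this */

done:
    free(scratch);        /* cleanup shared by the success and error paths */
    return result;
}
```

A caller can then tell the two outcomes apart by checking the return value against NULL, which is the same contract the interpreter expects from an extension function.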

Not incrementing the `listObj` reference count on return

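When your function is called, you are given borrowed references to your arguments. When you return one of those arguments, you are creating a new reference to the list, so you must increment its reference count. Otherwise the interpreter's reference count will be one lower than the number of actual references to the object. The result is a bug where the interpreter deallocates your list while there is still one reference to it rather than zero! That may cause a segfault, or in the worst case some random part of the program ends up accessing memory that has already been freed and reallocated for another object.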
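
The arithmetic is easier to see in a toy model. The sketch below is plain C, not CPython: ToyObject, toy_incref, toy_decref and the other names are hypothetical stand-ins. It simulates a caller that owns one reference, calls a function, and then drops the return value as if it were a new reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object (not PyObject). */
typedef struct {
    long refcnt;
    bool freed;
} ToyObject;

static void toy_incref(ToyObject *o) { o->refcnt++; }

static void toy_decref(ToyObject *o)
{
    assert(o->refcnt > 0);
    if (--o->refcnt == 0)
        o->freed = true;  /* a real runtime would release the memory here */
}

/* The bug: hands back the argument without taking a new reference. */
static ToyObject *return_borrowed(ToyObject *arg) { return arg; }

/* The fix: takes a new reference before handing the argument back. */
static ToyObject *return_owned(ToyObject *arg)
{
    toy_incref(arg);
    return arg;
}

/* Simulates a caller that holds one reference to an object, calls fn on it,
   and discards the result as a new reference. Returns true if the object was
   destroyed while the caller's own reference was still outstanding. */
static bool freed_too_early(ToyObject *(*fn)(ToyObject *))
{
    ToyObject list = { 1, false };   /* the caller's own reference */
    ToyObject *result = fn(&list);
    toy_decref(result);              /* the caller drops the return value */
    return list.freed;
}
```

With return_borrowed the count goes 1 to 0 while the caller still believes it holds the object. With return_owned it goes 1 to 2 and back to 1.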

Using `PyObject_SetItem` with primitives

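PyObject_SetItem can be used with dicts and any other class that implements obj[key] = val, so its key argument is a PyObject *. You cannot supply it with an argument of type Py_ssize_t. On the first iteration i is 0, which the function receives as a NULL key, and that is what produces the "null argument to internal routine" SystemError. Instead, use PyList_SetItem, which only accepts a Py_ssize_t as its index argument.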

Incorrect memory handling of `item` and `incremented_item`

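Both PyObject_SetItem and PyList_SetItem take care of decrementing the reference count of the object that is already at the position being set. So we don't need to worry about managing the reference count of item, as we are only using a reference borrowed from the list. PyList_SetItem also steals the reference to incremented_item, so we don't need to manage its reference count either. PyObject_SetItem does not steal a reference, so with that function you would still have to Py_DECREF incremented_item yourself.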

Memory leak on incorrect arguments

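Take, for example, calling your function with an int. You will have created a new reference to the 100 int object, but because you return NULL rather than goto error, that reference is lost. You need to handle such cases differently. In my solution, I moved the PyLong_FromLong call to after the argument parsing and type checking. This way we only create this new* object once we are guaranteed that it will be used.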

Working code

Side note: I removed the goto statements because only one was left, so it made more sense to do the error handling at that point rather than later.

static PyObject *
testadd_update_list(PyObject *self, PyObject *args)
{
    PyObject *listObj = NULL;
    PyObject *item = NULL;
    PyObject *mult = NULL;
    PyObject *incremented_item = NULL;
    Py_ssize_t numLines;

    if (!PyArg_ParseTuple(args, "O:update_list", &listObj))
    {
        return NULL;
    }
    if (!PyList_Check(listObj)) {
        PyErr_BadArgument();
        return NULL;
    }

    /* get the number of lines passed to us */
    // Don't want to rely on the error checking of this function as it gives a weird stack trace. 
    // Instead, we use PyList_Check() and PyErr_BadArgument() as above. Since listObj is definitely 
    // a list now, PyList_Size will never raise an error, and so we could use 
    // PyList_GET_SIZE(listObj) instead.
    numLines = PyList_Size(listObj);

    // only initialise mult here, otherwise the above returns would create a memory leak
    mult = PyLong_FromLong(100);
    if (mult == NULL) {
        return NULL;
    }

    for (Py_ssize_t i=0; i<numLines; i++) {
        // pick the item 
        // It is possible for this call to raise an error, but our invariants should
        // ensure no error is ever raised. `listObj` is always of type list and `i` is always 
        // in bounds.
        item = PyList_GetItem(listObj, i);

        // increment it, and check for type errors or memory errors
        incremented_item = PyNumber_Add(item, mult);
        if (incremented_item == NULL) {
            // ERROR!
            Py_DECREF(mult);
            return NULL;
        }

        // update the list item
        // We definitely have a list, and our index is in bounds, so we should never see an error 
        // here.
        PyList_SetItem(listObj, i, incremented_item);
        // PyList_SetItem steals our reference to incremented_item, and so we must be careful in 
        // how we handle incremented_item now. Either incremented_item will not be our 
        // responsibility any more or it is NULL. As such, we can just remove our Py_XDECREF call
    }

    // success!
    // We are returning a *new reference* to listObj. We must increment its ref count as a result!
    Py_INCREF(listObj);
    Py_DECREF(mult);
    return listObj;
}

Footnote:

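* PyLong_FromLong(100) does not actually create a new object, but returns a new reference to an existing one. In CPython, small integers (-5 through 256) are all cached, and the same object is returned whenever one is needed. This is an implementation detail that avoids repeatedly allocating and deallocating objects for small, frequently used integers, and so improves the performance of Python.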
