Pybind11: Create numpy view of data

I want to create a numpy view of data that lives in a C++ class.

But the code below creates a copy instead of a view.

Python test:

import _cpp
a = _cpp.A()
print(a)
a.view()[:] = 100  # should make it all 100.
print(a)

Result:

40028064 0 0 0  // Fail: Modifying a.mutable_data() in C++ doesn't 
                //       change _data[4]
40028064 0 0 0  // Fail: Modifying a.view() in Python 3 doesn't 
                //       change data in a

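The C++ line a.mutable_data()[0] = -100; does not change the 0th element to -100. This shows that py::array_t<int> a(4, &_data[0]); creates a copy of int _data[4]; rather than a view of it.

Modifying the array returned by a.view() does not change the data in a to 100s. This shows that a.view() returns a copy of the data in a, not a view.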
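As far as I can tell, this matches how pybind11 behaves: when py::array is constructed from a pointer without a base object, it copies the buffer. The copy versus view distinction itself can be reproduced from pure Python. The sketch below is only an analogy, not the pybind11 code path: the ctypes array `raw` is a hypothetical stand-in for the C++ member `int _data[4]`, and `np.array` / `np.frombuffer` stand in for "construct with a copy" and "construct over existing memory".

```python
import ctypes

import numpy as np

# Hypothetical stand-in for the C++ member `int _data[4]`.
raw = (ctypes.c_int * 4)(1, 2, 3, 4)

copied = np.array(raw)                      # allocates new memory and copies
viewed = np.frombuffer(raw, dtype=np.intc)  # wraps the existing buffer

copied[:] = 100
after_copy_write = list(raw)   # [1, 2, 3, 4]: the source is untouched

viewed[:] = 100
after_view_write = list(raw)   # [100, 100, 100, 100]: the source changed

# A quick way to tell the two apart without writing to them:
print(copied.flags.owndata, copied.base is None)   # True True
print(viewed.flags.owndata, viewed.base is None)   # False False
```

Checking `flags.owndata` and `base` on the array returned by a.view() is a cheap way to confirm whether the C++ side handed back a copy or a view.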

main.cpp:

#include <sstream>
#include <string>
#include "pybind11/pybind11.h"
#include "pybind11/numpy.h"

namespace py = pybind11;
class A {
public:
    A() {}
    std::string str() {
        std::stringstream o;
        for (int i = 0; i < 4; ++i) o << _data[i] << " ";
        return o.str();
    }
    py::array view() {
        py::array_t<int> a(4, &_data[0]);
        a.mutable_data()[0] = -100;
        return a;
    }
    int _data[4];
};

PYBIND11_MODULE(_cpp, m) {
    py::class_<A>(m, "A")
        .def(py::init<>())
        .def("__str__", &A::str)
        .def("view", &A::view, py::return_value_policy::automatic_reference);
}

CMakeLists.txt:

cmake_minimum_required(VERSION 3.9)
project(test_pybind11)

set(CMAKE_CXX_STANDARD 11)

# Find packages.
set(PYTHON_VERSION 3)
find_package( PythonInterp ${PYTHON_VERSION} REQUIRED )
find_package( PythonLibs ${PYTHON_VERSION} REQUIRED )

# Download pybind11
set(pybind11_url https://github.com/pybind/pybind11/archive/stable.zip)

set(downloaded_file ${CMAKE_BINARY_DIR}/pybind11-stable.zip)
file(DOWNLOAD ${pybind11_url} ${downloaded_file})
execute_process(COMMAND ${CMAKE_COMMAND} -E tar xzf ${downloaded_file}
        SHOW_PROGRESS)
file(REMOVE ${downloaded_file})

set(pybind11_dir ${CMAKE_BINARY_DIR}/pybind11-stable)
add_subdirectory(${pybind11_dir})
include_directories(${pybind11_dir}/include)

# Make python module
pybind11_add_module(_cpp main.cpp)

Following the py::cast(self) comment in issue 308, I tried py::cast(*this), and it works. I am a little uneasy about the view being invalidated, but numpy does the same thing.

Python test:

import _cpp
import numpy as np
a = _cpp.A()
print(a)
a.view()[:] = 100  # should make it all 100.
print(a)

Test result:

1480305816 32581 19420784 0 // original data of `a`
100 100 100 100 // It works: changing `a.view()` changes data of `a`.

main.cpp:

#include <sstream>
#include <string>
#include "pybind11/pybind11.h"
#include "pybind11/numpy.h"

namespace py = pybind11;
class A {
public:
    A() {}
    std::string str() {
        std::stringstream o;
        for (int i = 0; i < 4; ++i) o << _data[i] << " ";
        return o.str();
    }
    py::array view() {
        return py::array(4, _data, py::cast(*this));  // <---
    }
    int _data[4];
};

PYBIND11_MODULE(_cpp, m) {
    py::class_<A>(m, "A")
        .def(py::init<>())
        .def("__str__", &A::str)
        .def("view", &A::view, py::return_value_policy::reference_internal);
}

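I use reference_internal to tie the lifetime of a.view() to the lifetime of a.

The view is invalidated after the parent object is deleted.

After a is deleted in the Python test, Python may garbage-collect a's data at some indeterminate point. This means that if I previously stored a view with b = a.view(), b would be invalid after a is deleted.

I tried making a._data a numpy array on the C++ side, but this did not help with the invalidation.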

main.cpp:

class A {
public:
    A() : _data(4) {}  // array_t allocates and owns a 4-int buffer (contents uninitialised)
    std::string str() {
        std::stringstream o;
        for (int i = 0; i < 4; ++i) o << _data.data()[i] << " ";
        return o.str();
    }
    py::array view() {
        return py::array(4, _data.data(), py::cast(*this));
    }
    py::array_t<int> _data;
};

Python test:

import _cpp
import numpy as np
a = _cpp.A()
print(a)
a.view()[:] = 100  # should make it all 100.
b = a.view()
print('b is base?', b.base is None)
del a
print('b is base after deleting a?', b.base is None)

c = np.zeros(4)
print('c is base?', c.base is None)
d = c.view()
print('d is base?', d.base is None)
del c
print('d is base after deleting c?', d.base is None)

Result:

-6886248 32554 16092080 0 
// c++ code's management of views
b is base? False
b is base after deleting a? False
// numpy's management of views
c is base? True
d is base? False
d is base after deleting c? False

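It looks like when the base numpy array is deleted, ownership of the memory is not transferred to one of its views. The same holds for the C++ class. I think I will stick with the previous solution.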
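One way to probe what actually happens to the parent is to watch it through a weak reference. The sketch below uses only numpy arrays (not the `_cpp` module) and assumes CPython's reference counting, where an object is freed as soon as its last strong reference disappears. It suggests that ownership is indeed never transferred, but that the view's `base` attribute holds a strong reference that keeps the parent alive for as long as the view exists.

```python
import gc
import weakref

import numpy as np

c = np.zeros(4)
d = c.view()
view_base_is_parent = d.base is c      # True: the view points back at c

c_ref = weakref.ref(c)   # lets us observe c without keeping it alive

del c                    # drops the name, not necessarily the object
gc.collect()
alive_while_view_exists = c_ref() is not None   # True: d.base still holds c

d[:] = 7                 # still safe, the memory is owned by the live parent

del d                    # last strong reference to the parent goes away
gc.collect()
alive_after_view_deleted = c_ref() is not None  # False under CPython
```

If py::array(4, _data, py::cast(*this)) stores the A instance as the array's base in the same way, then `del a` in the test would only remove the name `a`, which would be consistent with `b.base is None` printing False both before and after the deletion.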