How to efficiently (without looping) get data from a tensor predicted by a TorchScript module in C++?
I am calling a TorchScript module (a neural network serialized from Python) from a C++ program:
// define inputs
int batch = 3; // batch size
int n_inp = 2; // number of inputs
double I[batch][n_inp] = {{1.0, 1.0}, {2.0, 3.0}, {4.0, 5.0}}; // some random input
std::cout << "inputs" "\n"; // print inputs
for (int i = 0; i < batch; ++i)
{
std::cout << "\n";
for (int j = 0; j < n_inp; ++j)
{
std::cout << I[i][j] << "\n";
}
}
// prepare inputs for feeding to neural network
std::vector<torch::jit::IValue> inputs;
inputs.push_back(torch::from_blob(I, {batch, n_inp}, at::kDouble));
// deserialize and load scriptmodule
torch::jit::script::Module module;
module = torch::jit::load("Net-0.pt");
// do forward pass
auto outputs = module.forward(inputs).toTensor();
Typically, to get the data out of the outputs, the following element-wise operation is performed:
// get data from outputs
std::cout << "outputs" << "\n";
int n_out = 1;
double outputs_data[batch][n_out];
for (int i = 0; i < batch; i++)
{
for (int j = 0; j < n_out; j++)
{
outputs_data[i][j] = outputs[i][j].item<double>();
std::cout << outputs_data[i][j] << "\n";
}
}
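However, this looping with .item is highly inefficient (in the actual code, millions of points will be predicted at each time step). I want to get the data from outputs directly, without looping over the elements. I tried: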
int n_out = 1;
double outputs_data[batch][n_out];
outputs_data = outputs.data_ptr<double>();
However, it gives the error:
error: incompatible types in assignment of ‘double*’ to ‘double [batch][n_out]’
outputs_data = outputs.data_ptr<double>();
^
Note that the type of outputs_data is fixed to double and cannot be changed.
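A deep copy is needed, as follows:
// requires <cstring>; assumes outputs is a contiguous CPU tensor of dtype double
double outputs_data[batch];
std::memcpy(outputs_data, outputs.data_ptr<double>(), sizeof(double) * batch);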
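The bulk-copy idea can be sketched in standard C++ without libtorch, so it can be compiled on its own. This is only a minimal sketch: copy_contiguous and element_at are hypothetical helper names, not part of any library, and the source pointer stands in for what outputs.data_ptr<double>() would return. It assumes the tensor is on the CPU, has dtype double, and is contiguous in row-major order. If that is not guaranteed, something like auto t = outputs.to(torch::kCPU).to(torch::kDouble).contiguous(); before taking t.data_ptr<double>() and t.numel() would likely be needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of libtorch): bulk-copies `count` doubles
// from a contiguous buffer into a freshly allocated std::vector.
// With libtorch, `src` would be t.data_ptr<double>() and `count` would be
// t.numel(), where `t` is assumed to be a contiguous CPU tensor of dtype double.
std::vector<double> copy_contiguous(const double* src, std::size_t count)
{
    assert(src != nullptr || count == 0);
    std::vector<double> dst(count);
    if (count > 0)
    {
        // One memcpy replaces the per-element .item<double>() loop.
        std::memcpy(dst.data(), src, count * sizeof(double));
    }
    return dst;
}

// Hypothetical row-major accessor: element (i, j) of a batch x n_out block
// stored flat in `data`.
inline double element_at(const std::vector<double>& data, std::size_t n_out,
                         std::size_t i, std::size_t j)
{
    return data[i * n_out + j];
}
```

A std::vector is used here instead of double outputs_data[batch][n_out] because a variable-length array on the stack is a compiler extension in C++ and is unlikely to hold millions of points. The element type stays double, and element_at(data, n_out, i, j) gives the same value that outputs_data[i][j] would.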