Is it possible to pre-allocate a variable to CPU/GPU memory in MexGateway code written in Visual Studio?

Question

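I am trying to write MexGateway code that passes two variables from MATLAB to a compiled MEX file, copies them to the GPU for a CUDA kernel, processes them, and returns the result to MATLAB. I need to call this MEX file inside a for loop in MATLAB.

The problem is that both inputs are large in my application, and only one of them (called Device_Data in the code below) changes on each iteration. So I am looking for a way to pre-allocate the input that stays constant, so that it is not removed from the GPU on every iteration of my for loop. I should also say that I really need to do this in my Visual Studio code, inside the MexGateway code (I do not want to do it in MATLAB). Is there a solution?

Here is my code (I have compiled it and it runs fine):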

#include <cuda_runtime.h>
#include "device_launch_parameters.h"
#include <stdio.h>
#include "cuda.h"
#include <iostream>
#include <mex.h>
#include "MexFunctions.cuh"




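// Element-wise add: each thread updates exactly one element.
// (Looping over all N elements in every thread would make the threads race on the same data.)
__global__ void add (int* Device_Data, int* Device_MediumX, int N) {
    int TID = threadIdx.y * blockDim.x + threadIdx.x;
    if (TID < N) {
        Device_Data[TID] = Device_Data[TID] + Device_MediumX[TID];
    }
}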
void mexFunction(int nlhs, mxArray* plhs[],
int nrhs, const mxArray* prhs[]) {

int N = 128;
int* MediumX;
int* Data;
int* Data_New;

// mxGetPr returns double*; mxGetData is the accessor for int32 inputs.
MediumX = (int*)mxGetData(prhs[0]);
Data = (int*)mxGetData(prhs[1]);

plhs[0] = mxCreateNumericMatrix(N,1, mxINT32_CLASS, mxREAL);
Data_New = (int*)mxGetData(plhs[0]);


int ArrayByteSize = sizeof(int) * N;
int* Device_MediumX; // device pointer to the X coordinates of the medium
gpuErrchk(cudaMalloc((int**)&Device_MediumX, ArrayByteSize));
gpuErrchk(cudaMemcpy(Device_MediumX, MediumX, ArrayByteSize, cudaMemcpyHostToDevice));

int* Device_Data; // device pointer to the data that changes on every iteration
gpuErrchk(cudaMalloc((int**)&Device_Data, ArrayByteSize));
gpuErrchk(cudaMemcpy(Device_Data, Data, ArrayByteSize, cudaMemcpyHostToDevice));

dim3 block(N, 1);
dim3 grid(1);//SystemSetup.NumberOfTransmitter
add<<<grid, block>>>(Device_Data, Device_MediumX, N);

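gpuErrchk(cudaMemcpy(Data_New, Device_Data, ArrayByteSize, cudaMemcpyDeviceToHost));

// Free the device buffers allocated above.
gpuErrchk(cudaFree(Device_MediumX));
gpuErrchk(cudaFree(Device_Data));

// cudaDeviceReset() destroys every allocation this process holds on the device,
// so no device memory survives into the next call.
cudaDeviceReset();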

}

Answer 1

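Yes, as long as you have MATLAB's Distributed Computing Toolbox / Parallel Computing Toolbox.

The toolbox provides gpuArray objects in ordinary MATLAB code, but it also has a C interface through which you can get and set the GPU addresses of these MATLAB arrays.

You can find the documentation here: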

https://uk.mathworks.com/help/parallel-computing/gpu-cuda-and-mex-programming.html?s_tid=CRUX_lftnav

For example, for the first input to the MEX file:

mxGPUArray const *dataHandler = mxGPUCreateFromMxArray(prhs[0]); // Can be CPU or GPU; copies to the GPU if it's not already there
float const *d_data = static_cast<float const *>(mxGPUGetDataReadOnly(dataHandler)); // get the read-only device pointer (assuming float data)
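
To show how those two calls fit into a whole gateway, here is a minimal sketch adapted to the question's int32 data. It assumes both inputs are real int32 arrays with the same number of elements; the kernel name `addKernel`, the error identifier `addOnGpu:input`, and the file name `addOnGpu.cu` are made up for illustration. The library calls are from the mxGPUArray C API in the Parallel Computing Toolbox.

```cuda
#include "mex.h"
#include "gpu/mxGPUArray.h"

// Element-wise add, one thread per element (hypothetical kernel name).
__global__ void addKernel(const int* mediumX, const int* data, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = data[i] + mediumX[i];
    }
}

void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[]) {
    // Must be called before any other mxGPU* function.
    mxInitGPU();

    if (nrhs != 2) {
        mexErrMsgIdAndTxt("addOnGpu:input", "Two inputs are required.");
    }

    // For a gpuArray input this references the existing device data (no copy);
    // for a CPU array the data is copied to the GPU.
    const mxGPUArray* mediumX = mxGPUCreateFromMxArray(prhs[0]);
    const mxGPUArray* data = mxGPUCreateFromMxArray(prhs[1]);

    // Assumption of this sketch: int32 inputs of equal length.
    if (mxGPUGetClassID(mediumX) != mxINT32_CLASS ||
        mxGPUGetClassID(data) != mxINT32_CLASS ||
        mxGPUGetNumberOfElements(mediumX) != mxGPUGetNumberOfElements(data)) {
        mxGPUDestroyGPUArray(mediumX);
        mxGPUDestroyGPUArray(data);
        mexErrMsgIdAndTxt("addOnGpu:input", "Inputs must be int32 arrays of equal size.");
    }

    const int* d_mediumX = static_cast<const int*>(mxGPUGetDataReadOnly(mediumX));
    const int* d_data = static_cast<const int*>(mxGPUGetDataReadOnly(data));

    // Allocate the output on the GPU with the same shape as the data input.
    const mwSize* dims = mxGPUGetDimensions(data);
    mxGPUArray* out = mxGPUCreateGPUArray(mxGPUGetNumberOfDimensions(data), dims,
                                          mxINT32_CLASS, mxREAL,
                                          MX_GPU_DO_NOT_INITIALIZE);
    mxFree(const_cast<mwSize*>(dims));
    int* d_out = static_cast<int*>(mxGPUGetData(out));

    const int n = static_cast<int>(mxGPUGetNumberOfElements(data));
    if (n > 0) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        addKernel<<<blocks, threads>>>(d_mediumX, d_data, d_out, n);
    }

    // Return the result as a gpuArray; it stays on the device.
    plhs[0] = mxGPUCreateMxArrayOnGPU(out);

    // Destroy the mxGPUArray wrappers. This does not free MATLAB's own gpuArray
    // inputs, and there is deliberately no cudaDeviceReset() here.
    mxGPUDestroyGPUArray(mediumX);
    mxGPUDestroyGPUArray(data);
    mxGPUDestroyGPUArray(out);
}
```

A file like this would be compiled with `mexcuda addOnGpu.cu`. With this design, the constant input is converted once before the loop with `gpuArray(int32(...))`, and every call inside the loop then reuses that device memory instead of transferring it again. Only the changing input is uploaded per iteration, and `gather` brings the result back to the CPU when needed. The one-time upload still happens on the MATLAB side, which the question wanted to avoid, so this illustrates the toolbox route suggested here and is not a pure MexGateway solution.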


memory

cuda

mex