CUDA image processing error
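I'm working on a small image processing project. I want to run a CUDA program that performs image subtraction: you have an image of a background and a second image of the same background with something else on top of it, and subtracting the two leaves only what was added. Both images are 480*360, and my GPU is a GTX780. My program throws the error ./main': free(): invalid next size (normal): 0x000000000126bd70 ***
Aborted (core dumped)
and the output image is wrong. I have been struggling to fix this. Here is the code:
Kernel: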
__global__ void add(unsigned char* a, unsigned char* b, unsigned char* c, int numCols, int numWidth) {
    int i = blockIdx.x * blockDim.x + threadIdx.x; //Column
    int j = blockIdx.y * blockDim.y + threadIdx.y; //Row
    if(i < numWidth && j < numCols)
    {
        int idx = j * numCols + i;
        c[idx] = b[idx] - a[idx];
    }
}
And the main function:
int main() {
    CImg<unsigned char> img1("1.bmp");
    CImg<unsigned char> img2("2.bmp");
    //both images have the same size
    int width = img1.width();
    int height = img1.height();
    int size = width * height * 3; //both images of same size
    dim3 blockSize(16, 16, 1);
    dim3 gridSize((width + blockSize.x - 1) / blockSize.x, (height + blockSize.y - 1) / blockSize.y, 1);
    unsigned char *dev_a, *dev_b, *dev_c;
    cudaMalloc((void**)&dev_a, size * (sizeof(unsigned char)));
    cudaMalloc((void**)&dev_b, size * (sizeof(unsigned char)));
    cudaMalloc((void**)&dev_c, size * (sizeof(unsigned char)));
    cudaMemcpy(dev_a, img1, size * (sizeof(unsigned char)), cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, img2, size * (sizeof(unsigned char)), cudaMemcpyHostToDevice);
    add<<<gridSize, blockSize>>>(dev_a, dev_b, dev_c, height, width);
    cudaMemcpy(img2, dev_c, size * (sizeof(unsigned char)), cudaMemcpyDeviceToHost);
    img2.save("out.bmp");
    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);
    return 0;
}
The images are loaded with the CImg library.
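The problem is incorrect use of the CImg container in the host code. According to the documentation, the image data pointer is accessed through the data() method, which means the cudaMemcpy calls in the host code should be given img1.data() and img2.data() rather than the image objects themselves.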
[This answer was compiled from comments and added as a community wiki entry]
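For illustration, here is a minimal sketch of host code that passes data() to cudaMemcpy. Several things in it are assumptions that do not come from the question or the answer: the kernel name subtract, the CHECK macro, the use of img1.size() in place of the hard-coded width * height * 3, and the cimg_display 0 setting that builds CImg without a display backend. It also swaps the 2D kernel for a flat 1D one. CImg stores each channel as a separate plane, so the subtraction can run over the whole buffer element by element. The original index j * numCols + i, with height passed as numCols, appears to reach only part of the first plane, which may explain the wrong output. Treat this as one possible arrangement, not the definitive fix.

```cuda
#define cimg_display 0          // assumption: build CImg without a display backend
#include "CImg.h"
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

using namespace cimg_library;

// Hypothetical helper (not in the original post): abort on any CUDA error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Element-wise difference over a flat buffer of n bytes.
// unsigned char arithmetic wraps modulo 256, as in the original kernel.
__global__ void subtract(const unsigned char* a, const unsigned char* b,
                         unsigned char* c, size_t n) {
    size_t idx = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (idx < n) {
        c[idx] = b[idx] - a[idx];
    }
}

int main() {
    CImg<unsigned char> img1("1.bmp");  // background only
    CImg<unsigned char> img2("2.bmp");  // background plus foreground

    // Refuse to continue unless width, height, depth and channels all match.
    if (!img1.is_sameXYZC(img2)) {
        std::fprintf(stderr, "images differ in size\n");
        return 1;
    }

    // Total number of stored values, whatever the channel count is.
    const size_t n = img1.size();

    unsigned char *dev_a, *dev_b, *dev_c;
    CHECK(cudaMalloc((void**)&dev_a, n));
    CHECK(cudaMalloc((void**)&dev_b, n));
    CHECK(cudaMalloc((void**)&dev_c, n));

    // data() returns the pointer to the pixel buffer owned by the image.
    CHECK(cudaMemcpy(dev_a, img1.data(), n, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dev_b, img2.data(), n, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);
    subtract<<<blocks, threads>>>(dev_a, dev_b, dev_c, n);
    CHECK(cudaGetLastError());

    // This blocking copy also waits for the kernel to finish.
    CHECK(cudaMemcpy(img2.data(), dev_c, n, cudaMemcpyDeviceToHost));
    img2.save("out.bmp");

    CHECK(cudaFree(dev_a));
    CHECK(cudaFree(dev_b));
    CHECK(cudaFree(dev_c));
    return 0;
}
```

Taking the byte count from img1.size() keeps the device copies within the buffer CImg actually allocated, even if a file loads with a channel count other than three.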