Why use memset when using CUDA?

Question

In a CUDA code example I saw memset used to initialize to all zeros the vectors that will store the sum of two other vectors. For example:

hostRef = (float *)malloc(nBytes);
gpuRef = (float *)malloc(nBytes);    
memset(hostRef, 0, nBytes);
memset(gpuRef, 0, nBytes);

What purpose does this serve if nothing else is done with these vectors?

You can see the code here: https://books.google.com/books?id=Jgx_BAAAQBAJ&pg=PA42#v=onepage&q&f=false

I'm not sure how long the link will keep working.

Answer 1

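When you get memory with 'malloc', it is not necessarily empty: its contents are indeterminate. Only 'calloc' zeroes the memory for you. For sanity and debugging purposes, it is recommended that you initialize your memory.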
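
To make the malloc/calloc difference concrete, here is a minimal sketch in plain C. The helper names (alloc_cleared_with_memset, alloc_cleared_with_calloc, all_zero) are made up for illustration and do not come from the book. The sketch assumes the usual IEEE 754 float format, where all-zero bytes represent 0.0f.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: malloc followed by memset, as in the question's snippet. */
float *alloc_cleared_with_memset(size_t n)
{
    float *p = (float *)malloc(n * sizeof *p);
    if (p != NULL) {
        /* malloc leaves the contents indeterminate, so clear them explicitly. */
        memset(p, 0, n * sizeof *p);
    }
    return p;
}

/* Hypothetical helper: calloc allocates and zeroes in a single call. */
float *alloc_cleared_with_calloc(size_t n)
{
    return (float *)calloc(n, sizeof(float));
}

/* Returns 1 if every element compares equal to 0.0f, otherwise 0. */
int all_zero(const float *p, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (p[i] != 0.0f) {
            return 0;
        }
    }
    return 1;
}
```

Both allocators should hand back a buffer for which all_zero returns 1. The caller still has to check for NULL and free the buffer.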

Answer 2

It would be useless if nothing else were done with these vectors, but that is not the case.

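The code runs a CUDA vector sum and then copies the result into *gpuRef. It then performs the same sum on the host CPU and puts the result in *hostRef. Finally, it compares the two results.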

Of course, it doesn't do anything with either array before copying new data into it, so the initialization to zero still serves no purpose.
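
For reference, the host-side half of that flow could look roughly like the sketch below. This is plain C, not the book's code: the function names, the eps tolerance parameter, and the choice to treat NaN as a mismatch are all assumptions. The GPU half (kernel launch and cudaMemcpy into gpuRef) is left out.

```c
#include <stddef.h>

/* Hypothetical host reference: element-wise sum c[i] = a[i] + b[i]. */
void sum_arrays_on_host(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

/* Hypothetical check: returns 1 if every pair of elements differs by at
 * most eps, otherwise 0. The !(d <= eps) form also rejects NaN, because
 * any comparison involving NaN is false. */
int results_match(const float *host_ref, const float *gpu_ref, size_t n, float eps)
{
    for (size_t i = 0; i < n; ++i) {
        float d = host_ref[i] - gpu_ref[i];
        if (d < 0.0f) {
            d = -d;
        }
        if (!(d <= eps)) {
            return 0;
        }
    }
    return 1;
}
```

Both buffers are fully overwritten before results_match reads them, which is why their initial contents don't affect the outcome on a successful run.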

Answer 3

This is the answer njuffa gave in the comments:

...The content of GPU memory doesn't change between invocations of the application. In case of a program failure, we would want to avoid picking up good data from a previous run, which may lead (erroneously) to a belief that the program executed fine. I have seen such cases in real-life, and it was very confusing to the affected programmers. Thus it is better to initialize result data to a known value, although I would have chosen 0xff instead of 0 as this corresponds to a NaN pattern for floating-point data.
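
A small sketch of the 0xff idea from that comment, in plain C for a host buffer. The helper names are made up for illustration, and the sketch assumes IEEE 754 single-precision floats and a build without options such as -ffast-math that disable NaN handling. With every byte set to 0xff, each float has all exponent bits set and a nonzero mantissa, which is a NaN encoding.

```c
#include <math.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: set every byte to 0xff so that each float
 * holds a NaN bit pattern (assumes IEEE 754 single precision). */
void fill_with_nan_pattern(float *buf, size_t n)
{
    memset(buf, 0xff, n * sizeof *buf);
}

/* Hypothetical helper: count the elements that are still NaN, i.e.
 * slots that were never overwritten with a real result. */
size_t count_nans(const float *buf, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (isnan(buf[i])) {
            ++count;
        }
    }
    return count;
}
```

If the copy back from the GPU fails silently, count_nans on the result buffer stays above zero, whereas a zero-filled buffer could be mistaken for a legitimate result.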


c

cuda

nvidia