Pycuda - 如何添加 -ccbin clang-3.8

Pycuda - How to add -ccbin clang-3.8

我目前正在尝试在 Debian 9 上使用 PyCUDA。我已经设法让 cuda 工作了,如果我 运行:

nvcc -ccbin clang-3.8 file.cu

我正确地编译了文件并且我能够运行它。

然而,在我使用

安装 pycuda 之后
apt-get install python-pycuda

和运行他们网站上的一个简单示例:

import pycuda.autoinit
import pycuda.driver as drv
import numpy

from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
   const int i = threadIdx.x;
   dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)

dest = numpy.zeros_like(a)
multiply_them(
         drv.Out(dest), drv.In(a), drv.In(b),
         block=(400,1,1), grid=(1,1))
print dest-a*b

但我收到以下错误:

CompileError                              Traceback (most recent call last)
<ipython-input-1-8e16128de7f2> in <module>()
     10   dest[i] = a[i] * b[i];
     11 }
---> 12 """)
     13 
     14 multiply_them = mod.get_function("multiply_them")

/usr/lib/python2.7/dist-packages/pycuda/compiler.pyc in __init__(self, source, nvcc, options, keep, no_extern_c, arch, code, cache_dir, include_dirs)
    263 
    264         cubin = compile(source, nvcc, options, keep, no_extern_c,
--> 265                 arch, code, cache_dir, include_dirs)
    266 
    267         from pycuda.driver import module_from_buffer

/usr/lib/python2.7/dist-packages/pycuda/compiler.pyc in compile(source, nvcc, options, keep, no_extern_c, arch, code, cache_dir, include_dirs, target)
    253         options.append("-I"+i)
    254 
--> 255     return compile_plain(source, options, keep, nvcc, cache_dir, target)
    256 
    257 

/usr/lib/python2.7/dist-packages/pycuda/compiler.pyc in compile_plain(source, options, keep, nvcc, cache_dir, target)
    135         raise CompileError("nvcc compilation of %s failed" % cu_file_path,
    136                 cmdline, stdout=stdout.decode("utf-8", "replace"),
--> 137                 stderr=stderr.decode("utf-8", "replace"))
    138 
    139     if stdout or stderr:

CompileError: nvcc compilation of /tmp/tmpVgfyrm/kernel.cu failed
[command: nvcc --cubin -arch sm_61 -I/usr/local/lib/python2.7/dist-packages/pycuda-2017.1.1-py2.7-linux-x86_64.egg/pycuda/cuda kernel.cu]
[stderr:
ERROR: No supported gcc/g++ host compiler found, but clang-3.8 is available.
       Use 'nvcc -ccbin clang-3.8' to use that instead.
]

有谁知道如何将 -ccbin clang-3.8 添加到 pycuda 吗??

根据 documentation,您可以通过两种方式为 nvcc 指定编译器选项

  1. 通过 PYCUDA_DEFAULT_NVCC_FLAGS 环境变量设置默认编译器选项。
  2. 通过使用 options= 关键字
  3. 传递的列表为给定 SourceModule 设置编译器选项

对于遇到问题的每个人,解决方案是给定的 b talonmies,使用选项参数。我使用的代码如下:

import pycuda.autoinit
import pycuda.driver as drv
import numpy

from pycuda.compiler import SourceModule
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
   const int i = threadIdx.x;
   dest[i] = a[i] * b[i];
}
""", options=["-ccbin","clang-3.8"])

multiply_them = mod.get_function("multiply_them")

a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)

dest = numpy.zeros_like(a)
multiply_them(
         drv.Out(dest), drv.In(a), drv.In(b),
         block=(400,1,1), grid=(1,1))
print dest-a*b

或使用:

pycuda.compiler.DEFAULT_NVCC_FLAG = ["-ccbin","clang-3.8"]