Why does this tiny Numba CUDA kernel fail to run?

I have a small kernel that demonstrates the problem I am running into:

import numpy as np
from numba import cuda, types


@cuda.jit(device=True, debug=True)
def mutate_genome(instruction_positions):
    return 0


@cuda.jit
def generate_mutants():
    instruction_positions = cuda.local.array(500, np.int64)
    mutate_genome(instruction_positions)


if __name__ == "__main__":
    generate_mutants[1, 1]()

All it does is allocate a local-memory array of 500 int64 elements and call a device function that receives that array.

But when I run the code under cuda-memcheck:

cuda-memcheck python xtests.py

it fails:

========= CUDA-MEMCHECK
Traceback (most recent call last):
  File "xtests.py", line 18, in <module>
    generate_mutants[1, 1]()
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 804, in __call__
    kernel = self.specialize(*args)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 815, in specialize
    kernel = self.compile(argtypes)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 831, in compile
    **self.targetoptions)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_lock.py", line 32, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 61, in compile_kernel
    cres = compile_cuda(pyfunc, types.void, args, debug=debug, inline=inline)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_lock.py", line 32, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 50, in compile_cuda
    locals={})
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 551, in compile_extra
    return pipeline.compile_extra(func)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 331, in compile_extra
    return self._compile_bytecode()
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 393, in _compile_bytecode
    return self._compile_core()
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 373, in _compile_core
    raise e
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 364, in _compile_core
    pm.run(self.state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 347, in run
    raise patched_exception
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 338, in run
    self._runPass(idx, pass_inst, state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_lock.py", line 32, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 302, in _runPass
    mutated |= check(pss.run_pass, internal_state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 275, in check
    mangled = func(compiler_state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/typed_passes.py", line 95, in run_pass
    raise_errors=self._raise_errors)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/typed_passes.py", line 67, in type_inference_stage
    infer.propagate(raise_errors=raise_errors)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/typeinfer.py", line 985, in propagate
    raise errors[0]
numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.typeinfer.CallConstraint object at 0x7f77bf6bb850>.
type object 'numpy.int64' has no attribute 'is_precise'
[1] During: resolving callee type: Function(<numba.cuda.compiler.DeviceFunctionTemplate object at 0x7f772a8e0210>)
[2] During: typing of call at xtests.py (14)

Enable logging at debug level for details.

File "xtests.py", line 14:
def generate_mutants():
    <source elided>
    mutate_genome(instruction_positions)
    ^

I'm on Linux Mint, Python 3.8, Numba 0.50.

Can anyone spot what I'm doing wrong?
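I found that everything works if I pass numba.types.int64 instead of np.int64 when creating the local memory allocation.

I guess NumPy types are not supported there. That would be consistent with the error message: is_precise is looked up on the dtype of the local array, and the NumPy class numpy.int64 has no such attribute.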