numba does not accept numpy arrays with dtype=object
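I have an empty array that I want to fill with lists of arbitrary length at each index [i,j]. So I initialize an empty array that is supposed to hold objects, like this: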
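import numpy as np
from numba import jit

length = 10  # assumed value; not defined in the original snippet

@jit(nopython=True, parallel=True)
def numba_function():
    values = np.empty((length, length), dtype=object)
    for i in range(length):
        for j in range(length):
            a_list_of_things = [1, 2, 3, 4]
            values[i, j] = a_list_of_things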
This fails with:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'object': cannot determine Numba type of <class 'type'>
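If I turn off numba's nopython mode by setting nopython=False, the code works fine. Setting dtype=list for the values array does not improve things.
Are there any clever tricks to overcome this?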
Numba in nopython mode (as of version 0.43.1) does not support object arrays.
The correct way to type an object array would be:
import numba as nb
import numpy as np

@nb.njit
def numba_function():
    values = np.empty((2, 2), np.object_)
    return values
But as stated above, that doesn't work:
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at resolving type of attribute "object_" of "$0.4":
NotImplementedError: object
This is also mentioned in the numba documentation:
2.7.1. Scalar types
Numba supports the following Numpy scalar types:
- Integers: all integers of either signedness, and any width up to 64 bits
- Booleans
- Real numbers: single-precision (32-bit) and double-precision (64-bit) reals
- Complex numbers: single-precision (2x32-bit) and double-precision (2x64-bit) complex numbers
- Datetimes and timestamps: of any unit
- Character sequences (but no operations are available on them)
- Structured scalars: structured scalars made of any of the types above and arrays of the types above
The following scalar types and features are not supported:
- Arbitrary Python objects
- Half-precision and extended-precision real and complex numbers
- Nested structured scalars: the fields of structured scalars may not contain other structured scalars
[...]
2.7.2. Array types
Numpy arrays of any of the scalar types above are supported, regardless of the shape or layout.
(Emphasis mine)
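A quick way to spot an array that falls outside this supported set is to inspect its dtype before handing it to a compiled function. The sketch below relies on NumPy's `dtype.hasobject` attribute. The helper name `is_object_free` is hypothetical, and the check only detects object dtypes, not the other unsupported cases such as half precision.

```python
import numpy as np

def is_object_free(arr):
    """Return True if the array's dtype contains no Python-object fields."""
    return not arr.dtype.hasobject

numeric = np.zeros(3)                      # float64: a plain scalar dtype
boxed = np.empty(2, dtype=object)          # object dtype: arbitrary Python objects
record = np.zeros(2, dtype=[("a", np.int64), ("b", np.float64)])  # structured

print(is_object_free(numeric), is_object_free(boxed), is_object_free(record))
```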
Since dtype=object allows arbitrary Python objects, it is not supported. And dtype=list is exactly equivalent to dtype=object (documentation):
Built-in Python types
Several python types are equivalent to a corresponding array scalar when used to generate a dtype object:
int np.int_
bool np.bool_
float np.float_
complex np.cfloat
bytes np.bytes_
str np.bytes_ (Python2) or np.unicode_ (Python3)
unicode np.unicode_
buffer np.void
(all others) np.object_
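The last row of that table can be checked directly. A small sketch, assuming only that NumPy is installed:

```python
import numpy as np

# A Python type with no dedicated NumPy scalar falls back to the object dtype.
list_dtype = np.dtype(list)
print(list_dtype == np.dtype(object))  # compare against the generic object dtype
print(list_dtype.kind)                 # 'O' is the kind code for object dtypes

# By contrast, built-in numeric types map to concrete numeric dtypes.
print(np.dtype(float) == np.dtype(np.float64))
```

So passing dtype=list to np.empty produces the same kind of array as dtype=object, which is why it does not help here.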
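All that said: object arrays would be very slow, and that applies to both NumPy arrays and numba functions. Whenever you choose to use such object arrays, you implicitly decide that you don't want high performance.
So if you want performance and use NumPy arrays, you need to rewrite the code so that it doesn't use object arrays in the first place. If it is still too slow, you can then think about applying numba to the non-object arrays.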
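One possible way to do such a rewrite, sketched here under assumptions rather than taken from the answer: store the variable-length lists in a padded 3-D integer array together with a 2-D array of lengths. The function name `fill_padded`, the `max_len` bound, and the per-cell length rule are all hypothetical and chosen only for illustration. The try/except fallback merely lets the sketch run where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fallback so the sketch still runs without numba
    def njit(func):
        return func

@njit
def fill_padded(n, max_len):
    # values[i, j, :lengths[i, j]] holds the "list" for cell (i, j);
    # the remaining slots are zero padding.
    values = np.zeros((n, n, max_len), dtype=np.int64)
    lengths = np.zeros((n, n), dtype=np.int64)
    for i in range(n):
        for j in range(n):
            k = (i + j) % max_len + 1  # hypothetical per-cell list length
            for t in range(k):
                values[i, j, t] = t + 1
            lengths[i, j] = k
    return values, lengths

values, lengths = fill_padded(3, 4)
cell = values[1, 2, :lengths[1, 2]]  # recover the list stored at [1, 2]
print(cell)
```

This trades memory for a fixed dtype: every cell reserves `max_len` slots, so it only suits data where the longest list is known and not much longer than the typical one.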