Numba - does nopython mode support list of tuples?

To be clear, this is my first time using Numba, so I am far from an expert. I am trying to implement a simple KNN by hand, with the following code:

import numpy as np
from numba import jit

@jit(nopython=True)
def knn(training_set, test_set):
    for q in range(len(test_set)):
        indexes = [-1]
        values = [np.inf]
        thres = values[-1]

        for u in range(len(training_set)):
            dist = 0
            flag = False
            dist = knn_dist(training_set[u], test_set[q], thres)
            if dist == 0:
                flag = True
            if not flag:

                '''
                Binary search to obtain the index
                '''

                # Various code

    return

Now I want to optimize the code with Numba's nopython mode. The relevant part of the error is:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
 in _call_incref_decref(self, builder, root_type, typ, value, funcname, getters)
    185             try:
--> 186                 meminfo = data_model.get_nrt_meminfo(builder, value)
    187             except NotImplementedError as e:

 in get_nrt_meminfo(self, builder, value)
    328                 raise NotImplementedError(
--> 329                     "unsupported nested memory-managed object")
    330         return value

NotImplementedError: unsupported nested memory-managed object

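Both the training set and the test set are lists of lists of tuples. I would like to know whether nopython mode supports this data structure and, if it does not (as it seems), which data structure I could use instead. Am I forced to change the Numba mode?

P.S. To illustrate, an example of the training/test data looks like this: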

[[(0, 1), (1, 1), (2, 1), (3, 2), (4, 5)], [(0, 2), (1, 4), (2, 3), (3, 4), (4, 2)], [(0, 5), (1, 4), (2, 3), (3, 4), (4, 2)], [(0, 6), (1, 5), (2, 4), (3, 3), (4, 2)], [(0, 0), (1, 9), (2, 8), (3, 9), (4, 8)], [(0, 5), (1, 4), (2, 3), (3, 4), (4, 2)]]

does nopython mode support list of tuples?

Yes, it does. But, as your error message suggests, not nested lists (here, a list of lists of tuples).
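
As a small sanity check of the first point, here is a minimal sketch that passes a flat list of integer tuples to a nopython function. The function name sum_second is made up for illustration. Recent Numba releases may emit a deprecation warning about reflected lists for this call, and the try/except fallback is only there so the snippet still runs where Numba is not installed.

```python
try:
    from numba import njit  # njit is shorthand for jit(nopython=True)
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without Numba.
    def njit(func):
        return func


@njit
def sum_second(pairs):
    # Add up the second element of every tuple in a flat list of tuples.
    total = 0
    for p in pairs:
        total += p[1]
    return total


flat = [(0, 1), (1, 1), (2, 1), (3, 2), (4, 5)]
print(sum_second(flat))  # 10
```

Wrapping several such lists in an outer list is what triggers the "unsupported nested memory-managed object" error from the question.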

Am I forced to change numba mode?

No, you are not.


You can easily convert your nested list of tuples L to a regular NumPy array:

L_arr = np.array(L)

Here is a demonstration, which also shows how you can test this yourself:

import numpy as np
from numba import jit

L = [[(0, 1), (1, 1), (2, 1), (3, 2), (4, 5)], [(0, 2), (1, 4), (2, 3), (3, 4), (4, 2)],
     [(0, 5), (1, 4), (2, 3), (3, 4), (4, 2)], [(0, 6), (1, 5), (2, 4), (3, 3), (4, 2)],
     [(0, 0), (1, 9), (2, 8), (3, 9), (4, 8)], [(0, 5), (1, 4), (2, 3), (3, 4), (4, 2)]]

L_arr = np.array(L)

@jit(nopython=True)
def foo(x):
    return x

Calling it with L raises an error:

print(foo(L))

LoweringError: Failed at nopython (nopython mode backend)
reflected list(reflected list((int64 x 2))): unsupported nested memory-managed object

With L_arr, you have a 3-dimensional NumPy array of shape (6, 5, 2):

print(foo(L_arr))

array([[[0, 1],
        [1, 1],
        [2, 1],
        [3, 2],
        [4, 5]],
        ...
       [[0, 5],
        [1, 4],
        [2, 3],
        [3, 4],
        [4, 2]]])

You may then want to refactor your logic so that it works efficiently on the NumPy array rather than on a nested list of tuples.
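
One way such a refactoring could look is sketched below. This is not the asker's knn_dist, which is not shown in the question. It assumes each tuple is an (index, value) pair and uses squared Euclidean distance on the value column. The function name nearest_index and the query sample are hypothetical, and the try/except fallback only keeps the snippet runnable where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # njit is shorthand for jit(nopython=True)
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without Numba.
    def njit(func):
        return func


@njit
def nearest_index(train, query):
    # train has shape (n_samples, n_points, 2), query has shape (n_points, 2).
    # Column 1 is assumed to hold the value of each (index, value) pair.
    best_idx = -1
    best_dist = np.inf
    for u in range(train.shape[0]):
        dist = 0.0
        for j in range(train.shape[1]):
            diff = train[u, j, 1] - query[j, 1]
            dist += diff * diff
        if dist < best_dist:  # strict comparison keeps the first minimum
            best_dist = dist
            best_idx = u
    return best_idx


L = [[(0, 1), (1, 1), (2, 1), (3, 2), (4, 5)], [(0, 2), (1, 4), (2, 3), (3, 4), (4, 2)],
     [(0, 5), (1, 4), (2, 3), (3, 4), (4, 2)], [(0, 6), (1, 5), (2, 4), (3, 3), (4, 2)],
     [(0, 0), (1, 9), (2, 8), (3, 9), (4, 8)], [(0, 5), (1, 4), (2, 3), (3, 4), (4, 2)]]

train = np.array(L)
query = np.array([(0, 5), (1, 4), (2, 3), (3, 4), (4, 3)])  # made-up test sample
print(nearest_index(train, query))  # 2
```

Because the function only indexes into arrays and keeps scalar state, it avoids lists entirely, which sidesteps the nested-container restriction.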