Numba cannot determine fingerprint of empty list even with signature

I am using a @jit signature to define the types of the incoming arguments, but when I call the function I get:

ValueError: cannot compute fingerprint of empty list

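I know the list is empty, but my signature defines its type, so I'm not sure why Numba doesn't use that signature.

I tried different forms of the signature (string and tuple), but I still get the error. It isn't clear to me from the documentation why these signatures don't fix the types of the incoming arguments, and why Numba still relies on type inference.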

import numba as nb
import numpy as np

@nb.jit("void(List(int64), int64, List(List(int64)))", nopython=True, cache=True)
def _set_indices(keys_as_int, n_keys, indices):
    for i, k in enumerate(keys_as_int):
        indices[k].append(i)
    indices = [([np.array(elt) for elt in indices])]

def group_by(keys):
    _, first_occurrences, keys_as_int = np.unique(keys, return_index=True, return_inverse=True)
    n_keys = max(keys_as_int) + 1
    indices = [[] for _ in range(max(keys_as_int) + 1)]
    print(str(keys_as_int) + str(n_keys) + str(indices))
    _set_indices(keys_as_int, n_keys, indices)
    return indices

result = group_by(['aaa', 'aab', 'aac', 'aaa', 'aac'])
print(str(result))

I expected the signature to enforce the data types of the incoming arguments without having to infer them.

Actual error:

<ipython-input-274-401e07cd4e63> in <module>
----> 1 result = group_by(['aaa', 'aab', 'aac', 'aaa', 'aac'])
      2 print(str(result))

<ipython-input-273-acdebb81069c> in group_by(keys)
      4     indices = [[] for _ in range(max(keys_as_int) + 1)]
      5     print(str(keys_as_int) + str(n_keys) + str(indices))
----> 6     _set_indices(keys_as_int, n_keys, indices)
      7     return indices

ValueError: cannot compute fingerprint of empty list

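I found a workaround that gets your code working. There is a GitHub issue describing almost the same problem you are facing. As far as I can tell, the signature only determines what gets compiled; at call time Numba still has to work out the type of each argument you pass, and it cannot do that for an empty Python list.

So I tried creating a list with a dummy value, -1, which is dropped at the end. However, I ran into a "reflected list" exception. You can read more about it here. So I had to use Numba's typed list instead. You can read more about this data type here.

Long story short, this is the final code: it works in nopython mode and returns the result you expect.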

import numba as nb
import numpy as np
from numba.typed import List

@nb.jit(nopython=True, cache=True)
def _set_indices(keys_as_int, n_keys, indices):
    # Append each position i to the list that belongs to its integer key
    for i, k in enumerate(keys_as_int):
        indices[k].append(i)

    # Drop the dummy element in the final result
    indices = [elem[1:] for elem in indices]

    # Return the final indices
    return indices

def group_by(keys):
    _, first_occurrences, keys_as_int = np.unique(keys, return_index=True,
                                                  return_inverse=True)
    n_keys = max(keys_as_int) + 1

    # Simply adding the dummy element to a plain Python list doesn't work here:
    # Error: cannot reflect element of reflected container: reflected list(reflected list(int64))
    # indices = [[-1] for _ in range(max(keys_as_int) + 1)]
    # A workaround is to build the nested list from Numba's typed List
    indices = List()
    for _ in range(n_keys):
        inner = List()
        inner.append(-1)  # dummy element so Numba can infer the item type
        indices.append(inner)

    print(keys_as_int, n_keys, indices)
    indices = _set_indices(keys_as_int, n_keys, indices)
    return indices

result = group_by(['aaa', 'aab', 'aac', 'aaa', 'aac'])

# Converting Numba's typed lists to NumPy arrays inside nopython mode raises an error,
# so do the conversion outside the jitted function
result = [np.asarray(elem) for elem in result]
print(result)

Here is a link to a Google Colab notebook with the working code. If you want to dig deeper into the reflected list exception, go to the last cell.