What is the correct signature for sets in numba jitted functions?

Question

If I understand correctly, I can increase the performance of a numba function by adding a signature. Example:

from numba import njit, int32

@njit(int32(int32, int32))
def f(x, y):
    # A somewhat trivial example
    return x + y

Now I have a function which takes two sets. What is the correct signature?

@njit(int32(set(int32), set(int32)))
def f(set_1, set_2):
    # A somewhat trivial example
    return x

I thought the signature (int32(set(int32), set(int32))) might be correct, but nothing happens. print(numba.typeof(set_1)) returns reflected set(int32).

Answer 1

If I understand correctly I can increase the performance of a numba function by adding a signature.

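That's incorrect, or only partially true. With a signature, numba simply compiles the function ahead of time instead of on the first call with arguments of those types. After the first call, both variants should be equally fast. In some cases a function might even be slightly faster without a signature (especially with arrays, where numba can take advantage of the alignment of the input array).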

Now I have function which takes two sets. What is the correct signature?

The correct signature for a Python set containing integers is:

numba.types.Set(numba.int64, reflected=True)

So the signature for a function that takes two sets (and returns one set) would be:

import numba as nb

reflected_int_set = nb.types.Set(nb.int64, reflected=True)

@nb.njit(reflected_int_set(reflected_int_set, reflected_int_set))
def f(set_1, set_2):
    return set_1

>>> f({1,2,3}, {3,4,5})
{1, 2, 3}

But since it (very likely) won't improve performance, I wouldn't bother with a signature at all.

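Also a word of caution: numba internally converts the Python set to a numba set, so passing a Python set to a numba function, or returning a set from a numba function to a Python context, copies the complete set. In most cases that overhead is far more significant than the potential speedup numba provides.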

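In my experience, sets and lists with numba only make sense if they are strictly confined to numba functions. So if you use them as arguments or return them (to non-numba functions/contexts), you have to measure the performance and check whether you really get a speedup.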

python

jit

set

signature

numba