Numba performance issue with np.nan and np.inf
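I am trying out Numba to speed up my code. I noticed a large performance difference depending on whether a function uses np.inf or np.nan. Below are three sample functions for illustration.

function1 is not accelerated by Numba.
function2 and function3 are both accelerated by Numba, but one uses np.nan and the other uses np.inf.

On my machine, the average runtimes of the three functions are 0.032284s, 0.041548s and 0.019712s, respectively. It seems that using np.nan is much slower than using np.inf. Why is there such a large performance difference? Thanks in advance.

Edit: I am using Python 3.7.11 and Numba 0.55.0rc1.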

    import numpy as np
    import numba as nb

    def function1(array1, array2):
        nr, nc = array1.shape
        output1 = np.empty((nr, nc), dtype='float')
        output2 = np.empty((nr, nc), dtype='float')
        output1[:] = np.nan
        output2[:] = np.nan
        for r in range(nr):
            row1 = array1[r]
            row2 = array2[r]
            diff = row1 - row2
            id_threshold = np.nonzero((row1 - row2) > 8)
            output1[r][id_threshold] = 1
            output2[r][id_threshold] = 0
        output1 = output1.flatten()
        output2 = output2.flatten()
        id_keep = np.nonzero(output1 != np.nan)
        output1 = output1[id_keep]
        output2 = output2[id_keep]
        output = np.vstack((output1, output2))
        return output

    @nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
    def function2(array1, array2):
        nr, nc = array1.shape
        output1 = np.empty((nr, nc), dtype='float')
        output2 = np.empty((nr, nc), dtype='float')
        output1[:] = np.nan
        output2[:] = np.nan
        for r in nb.prange(nr):
            row1 = array1[r]
            row2 = array2[r]
            diff = row1 - row2
            id_threshold = np.nonzero((row1 - row2) > 8)
            output1[r][id_threshold] = 1
            output2[r][id_threshold] = 0
        output1 = output1.flatten()
        output2 = output2.flatten()
        id_keep = np.nonzero(output1 != np.nan)
        output1 = output1[id_keep]
        output2 = output2[id_keep]
        output = np.vstack((output1, output2))
        return output

    @nb.njit('float64[:,::1](float64[:,::1], float64[:,::1])', parallel=True)
    def function3(array1, array2):
        nr, nc = array1.shape
        output1 = np.empty((nr, nc), dtype='float')
        output2 = np.empty((nr, nc), dtype='float')
        output1[:] = np.inf
        output2[:] = np.inf
        for r in nb.prange(nr):
            row1 = array1[r]
            row2 = array2[r]
            diff = row1 - row2
            id_threshold = np.nonzero((row1 - row2) > 8)
            output1[r][id_threshold] = 1
            output2[r][id_threshold] = 0
        output1 = output1.flatten()
        output2 = output2.flatten()
        id_keep = np.nonzero(output1 != np.inf)
        output1 = output1[id_keep]
        output2 = output2[id_keep]
        output = np.vstack((output1, output2))
        return output

    array1 = 10*np.random.random((1000,1000))
    array2 = 10*np.random.random((1000,1000))
    output1 = function1(array1, array2)
    output2 = function2(array1, array2)
    output3 = function3(array1, array2)

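The second function is much slower because output1 != np.nan is True for every element. NaN compares unequal to everything, including itself: np.nan != np.nan is True, and v != np.nan is True for any value v. As a result, id_keep selects every element, so output1[id_keep] and output2[id_keep] are full copies of the flattened arrays. The resulting arrays are much larger than intended, which makes execution slower.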
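
To make the point above concrete, here is a minimal sketch in plain NumPy (no Numba needed). The array values and the names values, mask_ne and kept are made up for illustration.

```python
import numpy as np

# Illustrative data: two "unfilled" NaN slots and two real values.
values = np.array([1.0, np.nan, 0.0, np.nan])

# NaN compares unequal to everything, including itself,
# so this mask is True at every position.
mask_ne = values != np.nan

# Indexing with the all-True mask copies the whole array:
# the NaN entries are NOT filtered out.
kept = values[np.nonzero(mask_ne)]

print(mask_ne.all())            # every comparison is True
print(kept.size, values.size)   # same size: nothing was dropped
```

With np.inf as the fill value, the same comparison does filter, because inf == inf is True. That is why function3 produces a much smaller output than function2.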

The point is that you must never compare a value to np.nan using comparison operators: use np.isnan(value) instead. In your case, you should use np.logical_not(np.isnan(output1)).
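
A sketch of the corrected filtering step, assuming the rest of the function stays as in the question. The helper name keep_filled and the small test arrays are hypothetical. It is shown in plain NumPy; the same expression should be usable inside the njit-compiled functions, since np.isnan and np.logical_not are ufuncs that Numba supports.

```python
import numpy as np

def keep_filled(output1, output2):
    """Keep only positions where output1 is not NaN (hypothetical helper)."""
    # np.isnan is the reliable NaN test; comparison operators are not.
    id_keep = np.nonzero(np.logical_not(np.isnan(output1)))
    return np.vstack((output1[id_keep], output2[id_keep]))

# Flattened arrays as they would look after the row loop:
# NaN marks positions that were never written.
output1 = np.array([np.nan, 1.0, np.nan, 1.0])
output2 = np.array([np.nan, 0.0, np.nan, 0.0])

result = keep_filled(output1, output2)
print(result.shape)  # only the two filled columns remain
```

Replacing the id_keep line in function1 and function2 with this np.isnan-based version makes them select the same elements as the np.inf variant.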
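
The second implementation may still be slightly slower because of the temporary array created by np.logical_not. That said, once the code was corrected, I did not see any statistically significant difference on my machine between using NaN and Inf.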