Why does Numba take longer to execute NumPy calculations than plain Python code?
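parallel.py is a Python file that uses Numba and NumPy to compute the sum of the primary diagonals of two matrices. The main goal is to measure how fast the code runs with Numba. parallel.py takes about 0.55 seconds to finish, while the same code written in pure Python in another file (sequencial.py) reports 0.00 seconds for the same problem, which is ironic.
I am not sure whether I am using Numba properly. Can anyone suggest what I need to do to achieve my objective?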
parallel.py
from numba import jit, njit
import numpy as np
import time

@jit(nopython=True)
def create_matrix(row, col):
    arr = np.zeros((row, col))
    for i in range(row):
        for j in range(1, col + 1):
            arr[i, j - 1] = j + (col * i)
    return arr
print("FIND THE SUM OF PRIMARY DIAGONALS OF ANY TWO MATRICES: ")
start = time.perf_counter()
# calculate the sum of primary diagonals of matrix1
m1 = create_matrix(4, 4) # you can adjust the size of the matrix by changing the row and column in brackets
print(f"Matrix 1 : {m1}")
print(f"Matrix 1 diagonal: {np.diagonal(m1)}")
print(f"Matrix 1 sum of primary diagonal is : {np.trace(m1)}")
mat1_sum = np.trace(m1)
# calculate the sum of primary diagonals of matrix2
m2 = create_matrix(4, 4) # you can adjust the size of the matrix by changing the row and column in brackets
print(f"Matrix 2 : {m2}")
print(f"Matrix 2 diagonal : {np.diagonal(m2)}")
print(f"Matrix 2 Sum of diagonal is : {np.trace(m2)}")
mat2_sum = np.trace(m2, dtype='i')
sum_of_two_diagonals = mat1_sum + mat2_sum
print(f"THE SUM IS : {sum_of_two_diagonals}")
finish = time.perf_counter()
print(f"Finished in {round(finish - start, 2)} second(s)")
sequencial.py
import numpy as np
import time
def create_matrix(row, col):
    arr = np.zeros((row, col))
    for i in range(row):
        for j in range(1, col + 1):
            arr[i, j - 1] = j + (col * i)
    return arr
print("FIND THE SUM OF PRIMARY DIAGONALS OF ANY TWO MATRICES: ")
start = time.perf_counter()
# calculate the sum of primary diagonals of matrix1
mat_1 = create_matrix(4, 4) # you can adjust the size of the matrix by changing the row and column in brackets
print(f"Matrix 1 : {mat_1}")
mat1_sum_of_primary_diagonal = 0
for i in range(len(mat_1)):
    for j in range(len(mat_1[i])):
        if i == j:
            print(mat_1[i][j])
            mat1_sum_of_primary_diagonal = mat1_sum_of_primary_diagonal + mat_1[i][j]
print(f"Matrix 1 sum of diagonals is: {mat1_sum_of_primary_diagonal}")
# calculate the sum of primary diagonals of matrix2
mat_2 = create_matrix(4, 4) # you can adjust the size of the matrix by changing the row and column in brackets
print(f"Matrix 2 : {mat_2}")
mat2_sum_of_primary_diagonal = 0
for i in range(len(mat_2)):
    for j in range(len(mat_2[i])):
        if i == j:
            print(mat_2[i][j])
            mat2_sum_of_primary_diagonal = mat2_sum_of_primary_diagonal + mat_2[i][j]
print(f"Matrix 2 sum of diagonals is: {mat2_sum_of_primary_diagonal}")
diagonals_total = mat1_sum_of_primary_diagonal + mat2_sum_of_primary_diagonal
print(f"THE SUM IS : {diagonals_total}")
finish = time.perf_counter()
print(f"Finished in {round(finish - start, 2)} second(s)")
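The compilation time of the Numba function is included in your benchmark because Numba compiles lazily: the function is compiled the first time it is called. You can specify the argument types in the decorator so that it is compiled eagerly, when the function is defined. Alternatively, you can run the benchmark twice and only consider the second run.
Here is an example: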
import numba as nb
import numpy as np

# The explicit signature makes Numba compile the function at definition time
@nb.njit('float64[:,::1](int_, int_)')
def create_matrix(row, col):
    arr = np.zeros((row, col))
    for i in range(row):
        for j in range(1, col + 1):
            arr[i, j - 1] = j + (col * i)
    return arr
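The other option mentioned above, timing a second run, could look roughly like the sketch below. The helper `time_call` is a hypothetical name introduced only for illustration, and the `ImportError` fallback is an assumption added so the script still runs where Numba is not installed (in that case both calls are plain Python and no compilation happens).

```python
import time

import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to a no-op decorator so the sketch runs without Numba.
    def njit(func):
        return func


@njit
def create_matrix(row, col):
    arr = np.zeros((row, col))
    for i in range(row):
        for j in range(1, col + 1):
            arr[i, j - 1] = j + (col * i)
    return arr


def time_call(row, col):
    """Hypothetical helper: return (result, elapsed seconds) for one call."""
    start = time.perf_counter()
    result = create_matrix(row, col)
    return result, time.perf_counter() - start


# First call: with Numba, this includes JIT compilation for these argument types.
_, first_elapsed = time_call(4, 4)
# Second call: reuses the already compiled code, so only execution is timed.
matrix, second_elapsed = time_call(4, 4)

print(f"first call : {first_elapsed:.6f} s")
print(f"second call: {second_elapsed:.6f} s")
```

With Numba installed, the first call is typically much slower than the second; the second number is the one to compare against the pure Python version.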
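Also, note that it is better not to include print calls in the timed part of a benchmark: their duration can be unstable and is probably not what you want to measure. On top of that, printing is generally slow compared to basic computations.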
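As a rough illustration of that point, the sketch below keeps only the computation between the two `perf_counter` calls and prints afterwards. It uses the pure Python `create_matrix` from the question; the restructuring is just one possible layout, not the only way to do it.

```python
import time

import numpy as np


def create_matrix(row, col):
    arr = np.zeros((row, col))
    for i in range(row):
        for j in range(1, col + 1):
            arr[i, j - 1] = j + (col * i)
    return arr


# Timed region: computation only, no printing.
start = time.perf_counter()
m1 = create_matrix(4, 4)
m2 = create_matrix(4, 4)
total = np.trace(m1) + np.trace(m2)
elapsed = time.perf_counter() - start

# Reporting happens after the clock has stopped.
print(f"Matrix 1 diagonal: {np.diagonal(m1)}")
print(f"Matrix 2 diagonal: {np.diagonal(m2)}")
print(f"THE SUM IS : {total}")
print(f"Finished in {elapsed:.6f} second(s)")
```

For work this small, a single measurement is still noisy; repeating the timed region many times (for example with the standard `timeit` module) gives a more stable figure.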
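Finally, note that the script is named "parallel.py", but nothing in it is expected to run in parallel, because Numba does not parallelize code by default (and in your case it would be slower anyway because of the overhead of creating threads).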