Avoid 3 for loops with numpy to have more efficient code

Question

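I have written some code that works but is slow, and I am looking for a more efficient way to do the same thing. Is it possible to modify this code to make it faster?

The code is rather complicated to explain, so apologies in advance if my explanation is not clear enough:

I am working with a big matrix that contains 7 matrices, each of shape 50x1000. The code works like this: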

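(1) I take the first element of each of the 7 matrices contained in the big matrix and build a list of those elements --> [a1b1, a2b1, ... a7b1]
(2) Once this list is built, I interpolate it, which produces a new list of 50 elements
(3) I then repeat points (1) and (2) for the first element of the second row of every matrix, and so on down to the first element of the last row.
(4) Once all elements of the first column have been processed, I move on to the second column of every matrix and repeat points (1), (2), (3)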
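
Points (1) and (2) for a single position can be sketched as below. This is only an illustration under assumptions: the array is a small random stand-in (7 matrices of shape 4x5 rather than 50x1000), and `row` and `col` are arbitrary picks. The slice `matrix[:, row, col]` collects the same 7 values that the innermost loop appends one by one.

```python
import numpy as np
from scipy.interpolate import barycentric_interpolate

rng = np.random.default_rng(0)
matrix = rng.random((7, 4, 5))         # stand-in: 7 matrices of shape 4x5
X = np.linspace(0.1, 0.8, 7)           # one x value per matrix
intervals = np.linspace(0.5, 1.5, 50)  # 50 points to evaluate at

row, col = 0, 0                        # arbitrary position, for illustration
# Point (1): the element at (row, col) from each of the 7 matrices
values = matrix[:, row, col]
# Point (2): interpolate those 7 values at the 50 points in `intervals`
row_interpolated = barycentric_interpolate(X, values, intervals)

print(values.shape, row_interpolated.shape)
```

Points (3) and (4) then just repeat this for every `row` and every `col`.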

Please let me know if anything is unclear and I will do my best to explain!

import numpy as np
from scipy.interpolate import barycentric_interpolate

matrix = np.random.rand(7, 50, 1000)

X = np.linspace(0.1,0.8,7)
intervals = np.linspace(0.5,1.5,50)

matrix_tot = []
for col in range(len(matrix[0][0])):
    matrix_i = []
    for row in range(len(matrix[0])):
        interp_1 = []
        for m in range(len(matrix)):
            values = matrix[m][row][col]
            interp_1.append(values)
        row_interpolated = barycentric_interpolate(X,np.array(interp_1),intervals)
        matrix_i.append(row_interpolated)
    matrix_tot.append(matrix_i)
matrix_tot = np.array(matrix_tot)
print(matrix_tot.shape)

Answer 1

I got a slight performance improvement by doing the following:

import numpy as np
from scipy.interpolate import barycentric_interpolate
import time

matrix = np.random.rand(7, 50, 1000)
X = np.linspace(0.1,0.8,7)
intervals = np.linspace(0.5,1.5,50)

def interpolate(a):
    return barycentric_interpolate(X, a, intervals)
start = time.process_time()
out = np.apply_along_axis(interpolate, 0, matrix)
print(time.process_time() - start)

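On my machine this takes about 3.52 seconds, compared with 3.76 seconds for the original loops. Note that the output here has shape (50, 50, 1000). To get the same layout as your matrix_tot, which has shape (1000, 50, 50), just transpose it.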

np.all(out.T == matrix_tot)
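
If you want to convince yourself of the shapes and the transpose, a quick check on a smaller array might look like this. It is a sketch under assumptions: the sizes are reduced (7 matrices of shape 3x4) so that it runs instantly, `ref` is a hypothetical name for the loop-based reference, and `np.allclose` is used instead of exact equality to be tolerant of floating-point rounding.

```python
import numpy as np
from scipy.interpolate import barycentric_interpolate

rng = np.random.default_rng(0)
matrix = rng.random((7, 3, 4))         # reduced stand-in for (7, 50, 1000)
X = np.linspace(0.1, 0.8, 7)
intervals = np.linspace(0.5, 1.5, 50)

def interpolate(a):
    return barycentric_interpolate(X, a, intervals)

# Interpolate along axis 0: the 7 samples become 50 interpolated values
out = np.apply_along_axis(interpolate, 0, matrix)

# Loop-based reference with the same layout as matrix_tot: [col][row][point]
ref = np.array([[interpolate(matrix[:, row, col]) for row in range(3)]
                for col in range(4)])

print(out.shape, out.T.shape, ref.shape)
print(np.allclose(out.T, ref))
```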

Answer 2

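Got it. Fortunately, you can pass multiple y vectors in a single function call. It appears to be vectorised, since the performance gain is about 2 orders of magnitude.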

import numpy as np
from scipy.interpolate import barycentric_interpolate
import time

matrix = np.random.rand(7, 50, 1000)
X = np.linspace(0.1,0.8,7)
intervals = np.linspace(0.5,1.5,50)


start = time.process_time()
matrix_tot = []
for col in range(len(matrix[0][0])):
    matrix_i = []
    for row in range(len(matrix[0])):
        interp_1 = []
        for m in range(len(matrix)):
            values = matrix[m][row][col]
            interp_1.append(values)
        row_interpolated = barycentric_interpolate(X,np.array(interp_1),intervals)
        matrix_i.append(row_interpolated)
    matrix_tot.append(matrix_i)
matrix_tot = np.array(matrix_tot)
print('baseline: ', time.process_time() - start)

## ANSWER HERE:
start = time.process_time()
matrix_reshaped = matrix.reshape(7, 50 * 1000)
matrix_tot2 = barycentric_interpolate(X, matrix_reshaped, intervals).reshape(50, 50, 1000)
# this is only for comparison with matrix_tot, you may want to remove line below:
matrix_tot2 = matrix_tot2.transpose([2, 1, 0])
print('vectorised: ', time.process_time() - start)

assert np.allclose(matrix_tot, matrix_tot2, atol=1e-8)

Results:

baseline:  5.796875
vectorised:  0.015625

And not a single for loop :)
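
As a possible variation, the reshape may not even be needed. The sketch below assumes that barycentric_interpolate accepts an N-dimensional yi and interpolates along the axis given by its `axis` argument (default 0), which is my reading of the SciPy documentation. It uses a reduced array (7 matrices of shape 3x4), and the names `direct` and `via_reshape` are only for illustration.

```python
import numpy as np
from scipy.interpolate import barycentric_interpolate

rng = np.random.default_rng(0)
matrix = rng.random((7, 3, 4))         # reduced stand-in for (7, 50, 1000)
X = np.linspace(0.1, 0.8, 7)
intervals = np.linspace(0.5, 1.5, 50)

# Pass the 3D array directly; axis 0 holds the 7 samples at the points X
direct = barycentric_interpolate(X, matrix, intervals, axis=0)

# Same computation through the explicit reshape used above, for comparison
via_reshape = barycentric_interpolate(
    X, matrix.reshape(7, -1), intervals
).reshape(50, 3, 4)

print(direct.shape)
print(np.allclose(direct, via_reshape))
```

Either way the interpolated axis comes first, so a transpose is still needed if you want the (col, row, point) layout of the original loops.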
