Vectorisation of nearest neighbour

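I'm looking for a way to improve the performance of my simple nearest-neighbour function, but I'm not very proficient at vectorising with numpy. Any help would be appreciated!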

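import numpy as np


def knn_search(pts_a, pts_b, k):
    """
    Finds, for each point in pts_b, its k nearest neighbours in pts_a
    :param pts_a: array of shape (n, d), the candidate neighbours
    :param pts_b: array of shape (m, d), the query points
    :param k: number of neighbours to return
    :return dist, idx: arrays of shape (m, k), nearest first; idx indexes into pts_a
    """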

    dist = np.empty((pts_b.shape[0], pts_a.shape[0]))
    for i in range(pts_b.shape[0]):
        dist[i, :] = np.linalg.norm(pts_a - pts_b[i, :], axis=1)

    idx = np.argsort(dist, axis=1)
    dist = np.sort(dist, axis=1)

    return dist[:, :k], idx[:, :k]


a = np.random.rand(10, 2)
b = np.random.rand(10, 2)

distance, indices = knn_search(a, b, 5)

You can replace the loop with an outer difference using broadcasting:

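def knn_search(pts_a, pts_b, k):
    """
    Finds, for each point in pts_b, its k nearest neighbours in pts_a
    :param pts_a: array of shape (n, d), the candidate neighbours
    :param pts_b: array of shape (m, d), the query points
    :param k: number of neighbours to return
    :return dist, idx: arrays of shape (m, k), nearest first; idx indexes into pts_a
    """

    # pts_b[:, None] has shape (m, 1, d), so the difference broadcasts to (m, n, d)
    dist = np.linalg.norm(pts_a - pts_b[:, None], axis=-1)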
    idx = np.argsort(dist, axis=1)
    dist = np.sort(dist, axis=1)

    return dist[:, :k], idx[:, :k]
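
If k is much smaller than the number of points in pts_a, the two full sorts may become the next bottleneck. The following is only a sketch of one possible variant, not something taken from the question: it keeps the same broadcast distance matrix but uses np.argpartition to pick out the k smallest entries per row and then sorts just those. The name knn_search_partition is made up for illustration, and it assumes k <= pts_a.shape[0].

```python
import numpy as np


def knn_search_partition(pts_a, pts_b, k):
    """For each point in pts_b, return distances to and indices of its k nearest points in pts_a."""
    # pts_a has shape (n, d) and pts_b[:, None] has shape (m, 1, d), so the
    # subtraction broadcasts to (m, n, d): one difference vector per (b, a) pair.
    diff = pts_a - pts_b[:, None]
    dist = np.linalg.norm(diff, axis=-1)  # shape (m, n)

    # argpartition moves the k smallest entries of each row into the first k
    # columns (in arbitrary order) without sorting the whole row.
    part = np.argpartition(dist, k - 1, axis=1)[:, :k]
    part_dist = np.take_along_axis(dist, part, axis=1)

    # Sort only those k candidates so the result is ordered nearest first.
    order = np.argsort(part_dist, axis=1)
    idx = np.take_along_axis(part, order, axis=1)
    return np.take_along_axis(part_dist, order, axis=1), idx


rng = np.random.default_rng(0)
a = rng.random((10, 2))
b = rng.random((7, 2))
distance, indices = knn_search_partition(a, b, 5)
print(distance.shape, indices.shape)  # (7, 5) (7, 5)
```

Note that the broadcast approach materialises an (m, n, d) intermediate array, so for very large inputs memory may matter more than the sort; whether the partition step actually helps is worth timing on your own data.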