MATLAB: Help in finding minimum distance

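I am trying to find the point with the smallest distance to a candidate set. Z is a matrix whose rows are dimensions and whose columns are points. I compute the inter-point distances, then record the point with the minimum distance together with that distance. The code snippet is below. It works for small dimensions and small point sets, but it takes a very long time for large data sets (N = 1 million data points, with high dimension as well). Is there an efficient way to do this?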

I suggest you use pdist to do the heavy lifting for you. This function computes the pairwise distance between every two points in your array. The resulting vector has to be put into matrix form using squareform so that you can find the minimum for each point:

N = 100;
Z = rand(2,N);  % each column is a 2-dimensional point

% pdist assumes that the second index corresponds to dimensions
% so we need to transpose inside pdist()
distmatrix = squareform(pdist(Z.','euclidean')); % output is [N, N] in size
% set diagonal values to infinity to avoid getting 0 self-distance as minimum
distmatrix = distmatrix + diag(inf(1,size(distmatrix,1)));
mindists = min(distmatrix,[],2); % find the minimum for each row
sum_dist = sum(mindists); % sum of each point's distance to its nearest other point

This computes every pair twice, but I think that is true of your original implementation as well.
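One caveat for the N = 1 million case in the question: pdist would return N(N-1)/2, roughly 5e11, distances (about 4 TB as doubles), and squareform would double that, so the full matrix cannot be held in memory. A possible alternative, sketched here under the assumption that the Statistics and Machine Learning Toolbox is available, is knnsearch, which returns only the k nearest neighbours of each point, so its output is N-by-2 instead of N-by-N. The seed, N and dimension below are illustrative values only, and whether this is fast enough for your data has not been checked.

```matlab
% Sketch only: assumes knnsearch from the Statistics and Machine Learning Toolbox.
rng(0);              % fixed seed so the run is repeatable (illustrative)
N = 1000;            % kept small here; illustrative value
Z = rand(5, N);      % each column is a point, as in the question
X = Z.';             % knnsearch expects one point per row

% Ask for the 2 nearest neighbours of every point within the same set.
% Neighbours come back sorted by distance, so column 1 is the point itself
% (distance 0) and column 2 is its nearest other point.
[idx, D] = knnsearch(X, X, 'K', 2);
nearest  = idx(:, 2);    % index of the nearest other point
mindists = D(:, 2);      % distance to that point
sum_dist = sum(mindists);
```

If Z contains exact duplicate points, column 2 of idx may be the point itself rather than its duplicate, although the distance (0) is still correct.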

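The idea is that pdist computes the pairwise distances between the rows of its input (each row is treated as one point), so we pass the transpose of Z to pdist. Since the full output is always a square matrix with zeros on the diagonal, pdist is implemented such that it only returns the values above the diagonal, packed into a vector. A call to squareform is therefore needed to obtain the proper distance matrix. Then the row-wise minimum of this matrix has to be found, but first we have to exclude the zeros on the diagonal. I was lazy, so I put inf on the diagonal to make sure the minimum is found elsewhere. Finally, we just have to sum up the minimal distances.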
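To make the steps concrete, here is a small hand-checkable sketch with three points. The point coordinates are made up for illustration, and the second output of min (called nearest here, a name not used in the answer) is added because the question also asks which point is closest.

```matlab
% Three 2-D points as columns: (0,0), (3,4), (0,1). Illustrative values only.
Z = [0 3 0;
     0 4 1];

d = pdist(Z.', 'euclidean');     % condensed vector [d12 d13 d23] = [5 1 sqrt(18)]
distmatrix = squareform(d);      % symmetric 3-by-3 matrix with zeros on the diagonal

% Put inf on the diagonal so a point is never its own nearest neighbour
distmatrix = distmatrix + diag(inf(1, size(distmatrix, 1)));

% Second output of min gives the column index, i.e. which point is nearest
[mindists, nearest] = min(distmatrix, [], 2);
% mindists = [1; sqrt(18); 1], nearest = [3; 3; 1]

sum_dist = sum(mindists);        % 2 + sqrt(18), about 6.2426
```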