Matlab calculating nearest neighbour distance for all (u, v) vectors in an array

Question

I am trying to calculate the distance between nearest neighbours within an n-by-2 matrix like the one shown below:

point_coordinates =

   11.4179  103.1400
   16.7710   10.6691
   16.6068  119.7024
   25.1379   74.3382
   30.3651   23.2635
   31.7231  105.9109
   31.8653   36.9388





%for loop going from the top of the vector column to the bottom
for counter = 1:size(point_coordinates,1) 
    %current point defined selected 
    current_point = point_coordinates(counter,:);

    %math to calculate distance between the current point and all the points 
    distance_search= point_coordinates-repmat(current_point,[size(point_coordinates,1) 1]);
    dist_from_current_point = sqrt(distance_search(:,1).^2+distance_search(:,2).^2);

    %line to omit self subtraction that gives zero
    dist_from_current_point (dist_from_current_point <= 0)=[];

    %gives the shortest distance calculated for a certain vector and current_point
    nearest_dist=min(dist_from_current_point);

end

%final line to plot the u,v vectors and the corresponding nearest neighbour
%distances
matnndist = [point_coordinates nearest_dist]

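I am not sure how to structure the 'for' loop / nearest_neighbour line so that I get the nearest-neighbour distance for each (u, v) vector. As written, nearest_dist is overwritten on every iteration, so only the value for the last point survives.

I would like, for example, to have for the first vector its coordinates and the corresponding shortest distance, for the second vector its coordinates and its shortest distance, and so on up to n.

Hope someone can help.

Thanks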

Answer 1

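I understand you want to obtain the minimum distance between different points.

You can compute the distance for each pair of points with bsxfun, remove the self-distances, and take the minimum. It is more efficient to work with squared distances and take the square root only at the end.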

n = size(point_coordinates,1);
dist = bsxfun(@minus, point_coordinates(:,1), point_coordinates(:,1).').^2 + ...
       bsxfun(@minus, point_coordinates(:,2), point_coordinates(:,2).').^2;
dist(1:n+1:end) = inf; %// remove self-distances
min_dist = sqrt(min(dist(:)));

Alternatively, you can use pdist. This avoids computing each distance twice, and also avoids the self-distances:

dist = pdist(point_coordinates);
min_dist = min(dist(:));
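
Both snippets above return a single number: the smallest distance over all pairs. The question asks for one nearest-neighbour distance per point, so here is a minimal sketch of how the bsxfun approach might be adapted, taking the minimum along each row instead of over the whole matrix. The names sq_dist and nearest_dist are chosen here for illustration, and the sketch assumes the seven points from the question.

```matlab
% Sample data from the question (n-by-2, one point per row)
point_coordinates = [11.4179 103.1400;
                     16.7710  10.6691;
                     16.6068 119.7024;
                     25.1379  74.3382;
                     30.3651  23.2635;
                     31.7231 105.9109;
                     31.8653  36.9388];

n = size(point_coordinates, 1);

% n-by-n matrix of squared distances between every pair of points
sq_dist = bsxfun(@minus, point_coordinates(:,1), point_coordinates(:,1).').^2 + ...
          bsxfun(@minus, point_coordinates(:,2), point_coordinates(:,2).').^2;

% Put inf on the diagonal so a point is never its own nearest neighbour
sq_dist(1:n+1:end) = inf;

% Row-wise minimum gives one value per point; square root taken only at the end
nearest_dist = sqrt(min(sq_dist, [], 2));

% Coordinates alongside the nearest-neighbour distance, as asked in the question
matnndist = [point_coordinates nearest_dist];
```

This builds the full n-by-n matrix, so memory grows quadratically with the number of points. That is fine for small data sets like this one, but may not suit very large ones.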

Answer 2

If I can suggest a built-in function, use knnsearch from the Statistics Toolbox. What you are essentially doing is a K-Nearest Neighbour (KNN) search, except that you are ignoring self-distances. You call knnsearch like this:

[idx,d] = knnsearch(X, Y, 'k', k);

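In simple terms, the KNN algorithm returns the k points in your data set that are closest to a given query point. Usually, the Euclidean distance is the distance metric used. For MATLAB's knnsearch, X is a 2D array that holds your data set, where each row is an observation and each column is a variable. Y holds the query points. Y is also a 2D array where each row is a query point, and it needs to have the same number of columns as X. We also specify the flag 'k' to denote how many closest points you want returned. By default, k = 1.

As such, idx will be an N x K matrix, where N is the total number of query points (the number of rows of Y) and K is the number of neighbours requested. Each row of idx holds the row indices of the k points in the data set that are closest to that query point. d is also an N x K matrix, and it returns the distances to those corresponding closest points.

What you want is to find, for each point in your data set, the closest other point, ignoring the self-distance. You can therefore set both X and Y to be the same, set k = 2, and discard the first column of both outputs to get the result you are looking for.

Therefore: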

[idx,d] = knnsearch(point_coordinates, point_coordinates, 'k', 2)
idx = idx(:,2);
d = d(:,2);

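We thus get idx and d.

This tells us that the first point in the data set is closest to point #3, at a distance of 17.3562. The second point in the data set is closest to point #5, at a distance of 18.5316. You can continue with the rest of the results in a similar pattern.

If you don't have access to the Statistics Toolbox, consider reading my Stack Overflow post on how I compute KNN from first principles: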

Finding K-nearest neighbors and its implementation

It is actually very similar to what you did before.
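
For reference, a toolbox-free version can stay close to the loop in the question. The sketch below is one possible way to do it, not the code from the linked post: it stores one result per iteration instead of overwriting a scalar, and it masks the self-distance by index rather than by testing for zero. The names nearest_idx, diffs and d are introduced here for illustration.

```matlab
% Sample data from the question (n-by-2, one point per row)
point_coordinates = [11.4179 103.1400;
                     16.7710  10.6691;
                     16.6068 119.7024;
                     25.1379  74.3382;
                     30.3651  23.2635;
                     31.7231 105.9109;
                     31.8653  36.9388];

n = size(point_coordinates, 1);
nearest_dist = zeros(n, 1);   % one nearest-neighbour distance per point
nearest_idx  = zeros(n, 1);   % row index of that nearest neighbour

for counter = 1:n
    % Differences between every point and the current point
    diffs = bsxfun(@minus, point_coordinates, point_coordinates(counter, :));

    % Euclidean distance from the current point to every point
    d = sqrt(sum(diffs.^2, 2));

    % Exclude the current point itself from the search
    d(counter) = inf;

    % Keep both the smallest distance and where it occurs
    [nearest_dist(counter), nearest_idx(counter)] = min(d);
end

% Coordinates alongside the nearest-neighbour distance
matnndist = [point_coordinates nearest_dist];
```

Masking by index keeps genuine duplicate points (distance zero) in play, whereas deleting every entry with distance <= 0 would silently drop them. For the data above, nearest_idx and nearest_dist should agree with the second column of the knnsearch outputs.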

Good luck!
