Calculating the Euclidean distance between each row of a data frame and all rows of another data frame, but the output should be which row

Question

My situation is similar to this one: calculating the euclidean dist between each row of a dataframe with all other rows in another dataframe

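So, I have two data frames, x and y. I want to compute the Euclidean distance between each row of x and each row of y, but what I am interested in is, for each row of x, which row of y has the smallest distance, because I want to cluster the rows of x according to their distance to the rows of y (x has 10 rows, y has 4 rows). So my output should look like this: 1 2 2 4 3 3 2 2 1 4. That is, the first row of x is closest to the first row of y, and so on. I am writing an algorithm for k-means clustering. I am new to R and need some help. Thanks.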

Answer 1

The accepted solution to the question you linked is:

unlist(lapply(seq_len(nrow(y)), function(i) min(sqrt(colSums((y[i, ] - t(x))^2)))))

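To get the index instead of the value, change min to which.min. Because you want one result per row of x (10 values, each between 1 and 4), the loop should also run over the rows of x and compare each one against all rows of y. This assumes x and y are numeric matrices; if they are data frames, convert them with as.matrix() first. This should return the output you are looking for:

unlist(lapply(seq_len(nrow(x)), function(i) which.min(sqrt(colSums((x[i, ] - t(y))^2)))))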
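
As a minimal sketch of the assignment the question describes (for each row of x, the index of the nearest row of y), here is a small self-contained example. The toy values and the names xm, ym, and nearest are made up for illustration and are not from the original post; the sketch assumes both data frames are numeric and have the same columns.

```r
# Toy data (hypothetical values, for illustration only)
x <- data.frame(a = c(0, 1, 9, 10, 6), b = c(0, 1, 9, 10, 6))  # points to assign
y <- data.frame(a = c(0, 10), b = c(0, 10))                    # candidate centers

# Convert to matrices so the arithmetic recycles column-wise as intended
xm <- as.matrix(x)
ym <- as.matrix(y)

# For each row of x: subtract it from every row of y (held as columns of t(ym)),
# sum the squared differences per column, and keep the index of the smallest distance
nearest <- apply(xm, 1, function(row) which.min(sqrt(colSums((t(ym) - row)^2))))
nearest <- unname(nearest)

print(nearest)
# 1 1 2 2 2
```

The result has one entry per row of x, and each entry is a row number of y. Note that which.min returns the first minimum, so in the case of an exact tie the lower-numbered row of y wins. The sqrt does not change which distance is smallest, so it could be dropped if only the index is needed.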

r

euclidean-distance