Calculate matrix of cumulative distances in R
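I need an efficient way to calculate a matrix of distances between a series of points. The catch is that you can only get from point 'i' to point 'k' by passing through every point 'j' in between. For example, imagine an island with 5 beaches: you want the distance between every pair of beaches along the coastline, because you cannot cut across the island (in both directions: clockwise or counterclockwise).
Here is some example data. (Note: you need to install the package 'geosphere' to use the 'distm' function, which calculates distances between GPS coordinates along the Earth's surface.)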
library("geosphere")
longitude = c(-119.003, -119.067, -119.121, -119.089, -119.003)
latitude = c(33.503, 33.539, 33.485, 33.413, 33.440)
long.lat.mat = as.matrix(cbind(longitude, latitude))
# Use "distm" to calculate direct ("Euclidean") distances between sites over the Earth's surface; distm returns metres, so / 1000 gives km
euclid.dist.mat = distm(long.lat.mat) / 1000
# Create an empty matrix of alongshore distances (from "rows" to "columns")
alongshore.dist.mat = matrix(ncol=dim(long.lat.mat)[1], nrow=dim(long.lat.mat)[1], data=NA)
# Diagonal is zero. Adjacent sites are the same as Euclidean distance
diag(alongshore.dist.mat) = 0
diag(alongshore.dist.mat[,-1]) = diag(euclid.dist.mat[,-1])
alongshore.dist.mat[1,dim(long.lat.mat)[1]] = euclid.dist.mat[1,dim(long.lat.mat)[1]]
alongshore.dist.mat[lower.tri(alongshore.dist.mat)] = t(alongshore.dist.mat)[lower.tri(t(alongshore.dist.mat))]
# > alongshore.dist.mat
#           [,1]      [,2]      [,3]      [,4]      [,5]
# [1,] 0.0000000 7.1650632        NA        NA 7.0131279
# [2,] 7.1650632 0.0000000 7.8265783        NA        NA
# [3,]        NA 7.8265783 0.0000000 8.5483605        NA
# [4,]        NA        NA 8.5483605 0.0000000 8.5365807
# [5,] 7.0131279        NA        NA 8.5365807 0.0000000
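Now, how do I fill in the remaining cells? For example:
alongshore.dist.mat[1,3] = 7.1650632 + 7.8265783 = 14.991642
...represents site 1 -> site 2 -> site 3. By contrast:
alongshore.dist.mat[3,1] = 8.5483605 + 8.5365807 + 7.0131279 = 24.098069
...represents site 3 -> site 4 -> site 5 -> site 1.
I suspect the "cumsum" function could be used to good effect, but I'm not sure exactly how to set it up. I'd like a solution that avoids for loops, because my real data has several dozen points.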
You can start by building a data frame of all the edges between pairs of locations:
dists <- expand.grid(x=1:5, y=1:5)
dists$weight <- alongshore.dist.mat[as.matrix(dists)]
dists <- subset(dists, x != y & !is.na(weight))
dists
#    x y   weight
# 2  2 1 7.165063
# 5  5 1 7.013128
# 6  1 2 7.165063
# 8  3 2 7.826578
# 12 2 3 7.826578
# 14 4 3 8.548360
# 18 3 4 8.548360
# 20 5 4 8.536581
# 21 1 5 7.013128
# 24 4 5 8.536581
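Because the sites form a closed loop, the same edge list could also be written down directly from consecutive site indices, without going through the NA-filled matrix. This is only a minimal sketch in base R, assuming the five step lengths printed in the question; the names step.km, from, to and edges are made up for illustration.

```r
# Step lengths in km, copied from the question: 1->2, 2->3, 3->4, 4->5, 5->1
step.km <- c(7.1650632, 7.8265783, 8.5483605, 8.5365807, 7.0131279)
n <- length(step.km)
from <- seq_len(n)
to <- from %% n + 1   # next site around the loop; site n wraps back to site 1
# List every edge in both directions, mirroring the dists table
edges <- data.frame(x = c(from, to), y = c(to, from), weight = rep(step.km, 2))
```

With real coordinates, step.km would presumably come from the adjacent-site entries of the distm result rather than being typed in by hand.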
Now you can build a graph and compute the shortest paths between all pairs:
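library(igraph)
# graph_from_data_frame() and distances() are the current names of graph.data.frame() and shortest.paths()
g <- graph_from_data_frame(dists, vertices=data.frame(x=1:5))
# The "weight" column is used as the edge weight; by default edge direction is ignored
distances(g)
#           1         2         3         4         5
# 1  0.000000  7.165063 14.991642 15.549709  7.013128
# 2  7.165063  0.000000  7.826578 16.374939 14.178191
# 3 14.991642  7.826578  0.000000  8.548360 17.084941
# 4 15.549709 16.374939  8.548360  0.000000  8.536581
# 5  7.013128 14.178191 17.084941  8.536581  0.000000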
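The shortest-path matrix is symmetric, whereas the example in the question gives different values for [1,3] and [3,1] depending on the direction of travel. If those directional distances are what is wanted, the "cumsum" idea from the question could be set up roughly as follows. This is a sketch in base R rather than part of the answer above; it assumes the sites are listed in order around a closed loop, and the names step.km, loop.km, pos, forward and shortest are made up for illustration.

```r
# Step lengths in km, copied from the question: 1->2, 2->3, 3->4, 4->5, 5->1
step.km <- c(7.1650632, 7.8265783, 8.5483605, 8.5365807, 7.0131279)
n <- length(step.km)
loop.km <- sum(step.km)            # one full circuit of the island
# Position of each site along the loop, measured from site 1
pos <- c(0, cumsum(step.km)[-n])
# forward[i, k]: distance from i to k travelling 1 -> 2 -> ... -> n -> 1
forward <- outer(pos, pos, function(a, b) (b - a) %% loop.km)
# Shorter of the two directions; t(forward)[i, k] is the same trip the other way round
shortest <- pmin(forward, t(forward))
```

Under these assumptions forward[1,3] and forward[3,1] reproduce the 14.991642 and 24.098069 from the question, and shortest should agree with the igraph result. No for loop is involved, so the same lines should carry over to a few dozen points.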