Measuring metrics of proximity (min, max, mean) of distance between polygons in the same shapefile
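I am looking for a way to measure the minimum, maximum, and mean distance between the edges of polygons in the same planar-projected (UTM) shapefile using R. Is there a package with these capabilities? So far I have come up short: the "rgeos" package with "gDistance" comes closest, but it seems to return only the minimum distance between polygons. The goal is to compute several measures of dispersion for the polygons. As an example, I have the islands of the state of Hawaii. You can download a shapefile of the US states here. I then isolate and project the state of Hawaii: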
library(rgdal)
library(rgeos)
states = readOGR(dsn=path.expand(),layer = "states")
hawaii = states[states$STATE_NAME == "Hawaii", ]
hawaii = spTransform(hawaii, CRS("+proj=utm +zone=4 +ellps=WGS84"))
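I would like to measure the maximum distance between the edges of the islands within the state of Hawaii, the minimum distance between island edges, and the mean distance between island edges. I have tried gDistance from rgeos, which should return the minimum distance between two edges, but it currently returns zero for me: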
gDistance(hawaii)
Any ideas? Trying the following call with "byid" only returns three 0s:
> gDistance(hawaii, hawaii, byid = TRUE)
0
0 0
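I would like to make this part of a for loop that assesses proximity metrics for ~200 separate polygon shapefiles, each containing multiple polygons. I only need to calculate proximity among the polygons within each shapefile, not across different shapefiles. Thanks for your help.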
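First, for the minimum: normally, if you run gDistance(hawaii, byid = TRUE), you get a matrix showing the minimum distance between each pair of islands (polygons) in the dataset. However, as discussed in the comments, for this to work each island needs its own ID in the polygon file.
For Hawaii and the shapefile you specified, this approach works for getting the minimum distances: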
library(sp)
hawaii_out <- disaggregate(hawaii)
gDistance(hawaii_out,byid = T)
1 2 3 4 5 6 7
1 0.00 26246.85 189520.49 299489.75 333273.01 367584.38 475015.98
2 26246.85 0.00 117413.58 228699.22 263368.91 296349.18 406123.52
3 189520.49 117413.58 0.00 41995.90 76905.51 110099.62 219964.68
4 299489.75 228699.22 41995.90 0.00 13568.15 12738.74 129211.73
5 333273.01 263368.91 76905.51 13568.15 0.00 14052.47 115235.51
6 367584.38 296349.18 110099.62 12738.74 14052.47 0.00 46840.79
7 475015.98 406123.52 219964.68 129211.73 115235.51 46840.79 0.00
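(Of course, you will need to check which ID corresponds to which island.)
This runs fairly quickly for Hawaii and that shapefile, but the function can easily take a long time if the number of islands (polygons) is large.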
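Since the question asks for a minimum, maximum, and mean per shapefile, the pairwise matrix can be collapsed into those three numbers. The following is only a sketch in base R: the matrix `d` is typed in by hand from the gDistance output above (in metres), and the names `d`, `pair_dists`, and `proximity` are made up for illustration. In a real loop, `d` would be the result of gDistance(hawaii_out, byid = TRUE) for each file. Note that the "mean" here is the mean of the pairwise minimum distances, which may or may not be the dispersion measure you want.

```r
# Pairwise minimum edge-to-edge distances in metres, copied by hand
# from the gDistance(hawaii_out, byid = TRUE) output shown above.
d <- matrix(c(
       0.00,  26246.85, 189520.49, 299489.75, 333273.01, 367584.38, 475015.98,
   26246.85,      0.00, 117413.58, 228699.22, 263368.91, 296349.18, 406123.52,
  189520.49, 117413.58,      0.00,  41995.90,  76905.51, 110099.62, 219964.68,
  299489.75, 228699.22,  41995.90,      0.00,  13568.15,  12738.74, 129211.73,
  333273.01, 263368.91,  76905.51,  13568.15,      0.00,  14052.47, 115235.51,
  367584.38, 296349.18, 110099.62,  12738.74,  14052.47,      0.00,  46840.79,
  475015.98, 406123.52, 219964.68, 129211.73, 115235.51,  46840.79,      0.00
), nrow = 7, byrow = TRUE, dimnames = list(1:7, 1:7))

# The matrix is symmetric with a zero diagonal, so the upper triangle
# holds each island pair exactly once and excludes self-distances.
pair_dists <- d[upper.tri(d)]

# Summary of the pairwise minimum distances for this one shapefile
proximity <- c(min = min(pair_dists),
               max = max(pair_dists),
               mean = mean(pair_dists))
print(proximity)
```

Taking only the upper triangle avoids counting each pair twice and keeps the zero diagonal from dragging the minimum and mean down.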
Edit: adding a method that identifies and measures the most extreme points of each pair of islands
library(leaflet)
library(leaflet.extras)
library(sp)
library(tidyr)
library(dplyr)
start <- Sys.time()
##Need the original long/lat shapefile for this
hawaii_ll = states[states$STATE_NAME == "Hawaii", ]
hawaii_out_ll <- disaggregate(hawaii_ll) ##Separate the islands
##Extract the original coordinates from the polygon file. Since we're dealing with straight-line polygons, the furthest distance between any two islands should lie on a vertex.
IslandCoords <- list()
for(i in rownames(hawaii_out_ll@data)){
  IslandCoords <- hawaii_out_ll@polygons[[as.numeric(i)]] %>%
    slot("Polygons") %>%
    .[[1]] %>%
    slot("coords") %>%
    cbind.data.frame(.,"Island"=i,"coord_ID"=paste0(i,"_",1:nrow(.)),stringsAsFactors=F) %>%
    bind_rows(IslandCoords,.)
}
colnames(IslandCoords)[1:2] <- c("longitude","latitude")
##Double for loop will calculate max distance for each pair of islands in the dataset
all_res <- list() ##Initialise list for final results
for (island1 in unique(IslandCoords$Island)){
  temp_res <- list() ##List for temp results
  for (island2 in unique(IslandCoords$Island)) {
    if(island1!=island2){ ##Avoid running the loop on the same island
      ##Subset points to a single pair of islands
      IslandCoordsTemp <- IslandCoords[IslandCoords$Island==island1|
                                         IslandCoords$Island==island2,]
      ##Derive the convex hull (outermost points) for the island pair
      IslandCoordsTemp <- IslandCoordsTemp[chull(IslandCoordsTemp[,1:2]),]
      ##Calculate distance matrix between points, tidy it and order by distance
      IslandTemp_scores <- spDists(as.matrix(IslandCoordsTemp[,1:2]),longlat=T) %>%
        data.frame("coord_ID_A"=IslandCoordsTemp$coord_ID,
                   "Island_A"=IslandCoordsTemp$Island,
                   .) %>%
        gather("coord_ID_B","distance",3:ncol(.)) %>%
        arrange(desc(distance))
      ##Next two lines map the gathered column names back to coordinate IDs and islands, so the maximum distance picked out is actually between points on different islands
      IslandTemp_scores$coord_ID_B <- IslandCoordsTemp$coord_ID[as.numeric(gsub("X","",IslandTemp_scores$coord_ID_B))]
      IslandTemp_scores$Island_B <- IslandCoordsTemp$Island[match(IslandTemp_scores$coord_ID_B,IslandCoordsTemp$coord_ID)]
      IslandTemp_scores <- IslandTemp_scores %>%
        filter(IslandTemp_scores$Island_A != IslandTemp_scores$Island_B) %>%
        head(1)
      ##Place results in temp list
      temp_res <- bind_rows(temp_res,
                            data.frame("Island1"=island1,
                                       "Island2"=island2,
                                       "distance"=IslandTemp_scores$distance,
                                       stringsAsFactors = F))
      ##Use this to make sure the code is running as expected
      print(paste(island1,island2))
    }
  }
  ##Bind all results into one data frame
  all_res <- bind_rows(all_res,temp_res)
}
##Spread into matrix (if needed, just to match gDistance's appearance)
all_res_spread <- all_res %>% spread(key = Island2,value = distance,fill = 0)
The distances are in kilometres.
Island1 1 2 3 4 5 6 7
1 1 0.0000 104.1285 272.4133 372.96831 374.27478 457.4984 624.7161
2 2 104.1285 0.0000 235.0730 334.42077 338.90971 420.2209 592.3716
3 3 272.4133 235.0730 0.0000 168.24874 174.68062 254.1973 430.2157
4 4 372.9683 334.4208 168.2487 0.00000 65.76585 143.4336 319.7396
5 5 374.2748 338.9097 174.6806 65.76585 0.00000 112.0591 283.6706
6 6 457.4984 420.2209 254.1973 143.43355 112.05911 0.0000 258.1099
7 7 624.7161 592.3716 430.2157 319.73960 283.67057 258.1099 0.0000
You can use addMeasure, with leaflet and leaflet.extras loaded, to sanity-check the results.
##Can use this to sense-check/confirm the results
leaflet(hawaii_out_ll) %>% addPolygons(label = row.names(hawaii_out_ll@data)) %>%
addProviderTiles(providers$CartoDB) %>% addMeasure(primaryLengthUnit = "metre") %>%
addMarkers(data=IslandCoordsTemp)
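To illustrate why restricting the search to hull vertices is safe, here is a small base R sketch on made-up data. It is a variation on the code above, not the same procedure: it takes the convex hull of each island separately rather than of the pair, and it uses planar Euclidean distance instead of spDists with longlat = TRUE. The point sets `a` and `b` and the helper `cross_dist` are hypothetical names introduced only for this example.

```r
set.seed(1)

# Two hypothetical "islands", each a cloud of vertices in planar coordinates
a <- cbind(x = runif(50, 0, 10),  y = runif(50, 0, 10))
b <- cbind(x = runif(50, 30, 40), y = runif(50, 5, 15))

# Euclidean distance from every row of p to every row of q
cross_dist <- function(p, q) {
  sqrt(outer(p[, 1], q[, 1], "-")^2 + outer(p[, 2], q[, 2], "-")^2)
}

# Brute force: largest distance over all vertex pairs (one from each island)
max_all <- max(cross_dist(a, b))

# Same search restricted to the convex hull vertices of each island
ha <- a[chull(a), , drop = FALSE]
hb <- b[chull(b), , drop = FALSE]
max_hull <- max(cross_dist(ha, hb))

# The two maxima agree, but the hull version compares far fewer pairs
print(c(all_pairs = nrow(a) * nrow(b), hull_pairs = nrow(ha) * nrow(hb)))
print(all.equal(max_all, max_hull))
```

The distance from a fixed point is a convex function, so its maximum over a polygon is reached at a hull vertex. That is why dropping interior vertices does not change the maximum, and it keeps the distance matrix small for polygons with many vertices.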