Measuring proximity metrics (min, max, mean) of distances between polygons in the same shapefile

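I am looking for a way to use R to measure the minimum, maximum, and mean distance between the edges of polygons in the same planar-projected (UTM) shapefile. Is there a package with these capabilities? So far I have come up short: the "rgeos" package with "gDistance" comes closest, but it seems to return only the minimum distance between polygons. Ideally the result would cover several measures of dispersion for the polygons. As an example, I have the islands of the state of Hawaii. You can download the shapefile of US states here. I then isolate and project the state of Hawaii: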

library(rgdal)
library(rgeos)

states = readOGR(dsn=path.expand(),layer = "states")
hawaii = states[states$STATE_NAME == "Hawaii", ] 
hawaii = spTransform(hawaii, CRS("+proj=utm +zone=4 +ellps=WGS84"))

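I would like to measure the maximum distance between island edges within the state of Hawaii, the minimum distance between island edges, and the mean distance between island edges. I have tried gDistance from rgeos, which should return the minimum distance between two edges, but it currently returns zero for me: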

gDistance(hawaii)

Any ideas? Trying the following call with "byid" also returns only zeros:

> gDistance(hawaii, hawaii, byid = TRUE)
  0
0 0

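I would like to make this part of a for loop that evaluates proximity metrics for ~200 separate polygon shapefiles, each containing multiple polygons. I only need to compute proximity among the polygons within each shapefile, not across different shapefiles. Thank you for your help.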

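First, for the minimum: normally, if you run gDistance(hawaii, byid = TRUE), you get a matrix showing the minimum distance between every pair of islands (polygons) in the dataset. However, as discussed in the comments, for this to work each island needs its own ID in the polygon file.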

For Hawaii and the shapefile you specified, this approach works for obtaining the minimum distances:

library(sp)
hawaii_out <- disaggregate(hawaii)   # split the single Hawaii feature into one polygon per island
gDistance(hawaii_out, byid = TRUE)
              1         2         3         4         5         6         7
    1      0.00  26246.85 189520.49 299489.75 333273.01 367584.38 475015.98
    2  26246.85      0.00 117413.58 228699.22 263368.91 296349.18 406123.52
    3 189520.49 117413.58      0.00  41995.90  76905.51 110099.62 219964.68
    4 299489.75 228699.22  41995.90      0.00  13568.15  12738.74 129211.73
    5 333273.01 263368.91  76905.51  13568.15      0.00  14052.47 115235.51
    6 367584.38 296349.18 110099.62  12738.74  14052.47      0.00  46840.79
    7 475015.98 406123.52 219964.68 129211.73 115235.51  46840.79      0.00

(You will of course need to check which ID corresponds to which island.)
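
To reduce a matrix like the one above to a single set of numbers per shapefile, one option is to summarise its upper triangle, so that each pair of islands is counted once. The sketch below is a minimal illustration, not part of the original answer: the helper name `summarise_pairs` is hypothetical, and the matrix is retyped from the output above. Note that these are statistics of the pairwise minimum edge-to-edge distances, not of every possible edge-to-edge distance.

```r
# Pairwise minimum distances in metres, retyped from the gDistance output above
d <- matrix(c(
       0.00,  26246.85, 189520.49, 299489.75, 333273.01, 367584.38, 475015.98,
   26246.85,      0.00, 117413.58, 228699.22, 263368.91, 296349.18, 406123.52,
  189520.49, 117413.58,      0.00,  41995.90,  76905.51, 110099.62, 219964.68,
  299489.75, 228699.22,  41995.90,      0.00,  13568.15,  12738.74, 129211.73,
  333273.01, 263368.91,  76905.51,  13568.15,      0.00,  14052.47, 115235.51,
  367584.38, 296349.18, 110099.62,  12738.74,  14052.47,      0.00,  46840.79,
  475015.98, 406123.52, 219964.68, 129211.73, 115235.51,  46840.79,      0.00),
  nrow = 7, byrow = TRUE)

# Hypothetical helper: min, max and mean over the unique island pairs
summarise_pairs <- function(d) {
  pairs <- d[upper.tri(d)]   # drop the zero diagonal and the mirrored lower half
  c(min = min(pairs), max = max(pairs), mean = mean(pairs))
}

prox <- summarise_pairs(d)
print(prox)
```

In a loop over many shapefiles, the same helper could be applied to the gDistance(..., byid = TRUE) result for each file and the returned vectors collected into one table.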

This runs fairly quickly for Hawaii and that shapefile, but the function can easily take a long time if the number of islands (polygons) is large.

Edit: adding a method that identifies and measures the most extreme points for each pair of islands

library(leaflet)
library(leaflet.extras)
library(sp)
library(tidyr)
library(dplyr)

start <- Sys.time()
## The original long/lat (unprojected) polygons are needed for this
hawaii_ll = states[states$STATE_NAME == "Hawaii", ]
hawaii_out_ll <- disaggregate(hawaii_ll) ## Separate the islands

## Extract the original coordinates from the polygon file. Since the polygons
## are made of straight edges, the furthest distance between any two islands
## should lie on a vertex.
IslandCoords <- list()
for(i in rownames(hawaii_out_ll@data)){
  IslandCoords <- hawaii_out_ll@polygons[[as.numeric(i)]] %>%
    slot("Polygons") %>%
    .[[1]] %>%
    slot("coords") %>%
    cbind.data.frame(.,"Island"=i,"coord_ID"=paste0(i,"_",1:nrow(.)),stringsAsFactors=F) %>%
    bind_rows(IslandCoords,.)
}
colnames(IslandCoords)[1:2] <- c("longitude","latitude")

## Double for loop calculates the max distance for each pair of islands in the dataset
all_res <- list() ## Initialise list for final results
for (island1 in unique(IslandCoords$Island)){
  temp_res <- list() ## List for temp results
  for (island2 in unique(IslandCoords$Island)) {
    if(island1!=island2){   ## Skip pairing an island with itself
      ## Subset points to a single pair of islands
      IslandCoordsTemp <- IslandCoords[IslandCoords$Island==island1|
                                         IslandCoords$Island==island2,]
      ## Derive the convex hull (outermost points) for the island pair
      IslandCoordsTemp <- IslandCoordsTemp[chull(IslandCoordsTemp[,1:2]),]

      ## Calculate distance matrix between points (km), tidy it and order by distance
      IslandTemp_scores <- spDists(as.matrix(IslandCoordsTemp[,1:2]),longlat=TRUE) %>%
        data.frame("coord_ID_A"=IslandCoordsTemp$coord_ID,
                   "Island_A"=IslandCoordsTemp$Island,
                   .) %>%
        gather("coord_ID_B","distance",3:ncol(.)) %>%
        arrange(desc(distance))

      ## Map the column labels back to coordinate IDs and islands, then keep only
      ## the longest distance between points on different islands
      IslandTemp_scores$coord_ID_B <- IslandCoordsTemp$coord_ID[as.numeric(gsub("X","",IslandTemp_scores$coord_ID_B))]
      IslandTemp_scores$Island_B <- IslandCoordsTemp$Island[match(IslandTemp_scores$coord_ID_B,IslandCoordsTemp$coord_ID)]
      IslandTemp_scores <- IslandTemp_scores %>%
        filter(Island_A != Island_B) %>%
        head(1)

      ## Place results in temp list
      temp_res <- bind_rows(temp_res,
                            data.frame("Island1"=island1,
                                       "Island2"=island2,
                                       "distance"=IslandTemp_scores$distance,
                                       stringsAsFactors = F))

      ## Use this to make sure the code is running as expected
      print(paste(island1,island2))
    }
  }
  ## Bind all results into one data frame
  all_res <- bind_rows(all_res,temp_res)
}

##Spread into matrix (if needed, just to match gDistance's appearance)
all_res_spread <- all_res %>% spread(key = Island2,value = distance,fill = 0)

The distances are in kilometres.

  Island1        1        2        3         4         5        6        7
1       1   0.0000 104.1285 272.4133 372.96831 374.27478 457.4984 624.7161
2       2 104.1285   0.0000 235.0730 334.42077 338.90971 420.2209 592.3716
3       3 272.4133 235.0730   0.0000 168.24874 174.68062 254.1973 430.2157
4       4 372.9683 334.4208 168.2487   0.00000  65.76585 143.4336 319.7396
5       5 374.2748 338.9097 174.6806  65.76585   0.00000 112.0591 283.6706
6       6 457.4984 420.2209 254.1973 143.43355 112.05911   0.0000 258.1099
7       7 624.7161 592.3716 430.2157 319.73960 283.67057 258.1099   0.0000
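
The convex hull step rests on a geometric fact: the two farthest-apart points of a planar point set are both vertices of its convex hull, so only the hull points need to be compared. The following is a minimal base-R sketch of that fact on made-up random points (not the Hawaii data), using planar Euclidean distance rather than the great-circle distance that spDists computes with longlat = TRUE.

```r
set.seed(1)
pts <- cbind(x = runif(200), y = runif(200))   # made-up points for illustration

hull_idx <- chull(pts)                          # row indices of the hull vertices

max_all  <- max(dist(pts))                      # farthest pair over all 200 points
max_hull <- max(dist(pts[hull_idx, ]))          # farthest pair over hull vertices only

print(c(n_points = nrow(pts), n_hull = length(hull_idx)))
print(isTRUE(all.equal(max_all, max_hull)))
```

This guarantees the overall farthest pair of the combined point set. The loop above additionally keeps only pairs on different islands, so when the farthest hull points belong to the same island, the reported value is the largest cross-island distance among the hull points only.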

You can sanity-check the results with leaflet (and leaflet.extras), using the addMeasure measuring tool.

## Can use this to sense-check/confirm the results
## Note: IslandCoordsTemp holds only the hull points of the last island pair processed
leaflet(hawaii_out_ll) %>% addPolygons(label = row.names(hawaii_out_ll@data)) %>%
  addProviderTiles(providers$CartoDB) %>% addMeasure(primaryLengthUnit = "meters") %>%
  addMarkers(data=IslandCoordsTemp)