How to use geopandas or shapely to find the nearest point in the same GeoDataFrame

Question

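I have a GeoDataFrame with about 25 locations, each represented as a point geometry. I am trying to write a script that goes through each point, identifies the nearest other location, and returns that location's name and the distance to it.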

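I can do this easily with nearest_points(geom1, geom2) from shapely.ops when the two sets of points live in different GeoDataFrames. However, all of my locations are stored in a single GeoDataFrame. I am trying to loop over it, and that is where I run into trouble.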

Here is my sample file:

import geopandas as gpd
from shapely.geometry import Point
from shapely.ops import nearest_points

geofile = gpd.GeoDataFrame([[0, 'location A', Point(55, 55)],
                            [1, 'location B', Point(66, 66)],
                            [2, 'Location C', Point(99, 99)],
                            [3, 'Location D', Point(11, 11)]],
                           columns=['ID','Location','geometry'])

This is the loop I wrote, but it does not work:

for index, row in geofile.iterrows():
    nearest_geoms=nearest_points(row, geofile)
    print('location:' + nearest_geoms[0])
    print('nearest:' + nearest_geoms[1])
    print('-------')

I get this error:

AttributeError: 'Series' object has no attribute '_geom'

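But I think my problem goes beyond this error, because I also have to exclude the row I am currently looping over somehow; otherwise it will be returned as its own nearest location.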

My desired end result for one location would look like this:

([0, 'location A', 'location B', '5 miles', Point(55,55)], columns=['ID', 'Location', 'Nearest', 'Distance', 'geometry'])

Answer 1

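Shapely's nearest_points function compares two shapely geometries. In your loop, row is a pandas Series and geofile is a GeoDataFrame; neither is a shapely geometry, hence the AttributeError. To compare a single Point geometry against multiple other Point geometries, you can use .unary_union to merge them and compare against the resulting MultiPoint geometry. And yes, in each row operation, drop the current point so that it is not compared against itself.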

import geopandas as gpd
from shapely.geometry import Point
from shapely.ops import nearest_points

df = gpd.GeoDataFrame([[0, 'location A', Point(55,55)], 
                       [1, 'location B', Point(66,66)],
                       [2, 'Location C', Point(99,99)],
                       [3, 'Location D' ,Point(11,11)]], 
                      columns=['ID','Location','geometry'])
df.insert(3, 'nearest_geometry', None)

for index, row in df.iterrows():
    point = row.geometry
    multipoint = df.drop(index, axis=0).geometry.unary_union
    queried_geom, nearest_geom = nearest_points(point, multipoint)
    df.loc[index, 'nearest_geometry'] = nearest_geom

This results in:

    ID  Location    geometry        nearest_geometry
0   0   location A  POINT (55 55)   POINT (66 66)
1   1   location B  POINT (66 66)   POINT (55 55)
2   2   Location C  POINT (99 99)   POINT (66 66)
3   3   Location D  POINT (11 11)   POINT (55 55)
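
The question also asks for the name of the nearest location and the distance to it, which the loop above does not store. A minimal sketch of one way to get both, assuming GeoSeries.distance is acceptable in place of nearest_points (the column names Nearest and Distance are taken from the desired output in the question; everything else is illustrative):

```python
import geopandas as gpd
from shapely.geometry import Point

df = gpd.GeoDataFrame(
    {"ID": [0, 1, 2, 3],
     "Location": ["location A", "location B", "Location C", "Location D"]},
    geometry=[Point(55, 55), Point(66, 66), Point(99, 99), Point(11, 11)],
)

nearest_names = []
nearest_distances = []
for index, row in df.iterrows():
    # Drop the current row so a point is never matched with itself.
    others = df.drop(index)
    # Planar distance from every remaining geometry to the current point.
    dists = others.geometry.distance(row["geometry"])
    nearest_index = dists.idxmin()
    nearest_names.append(df.loc[nearest_index, "Location"])
    nearest_distances.append(dists.loc[nearest_index])

df["Nearest"] = nearest_names
df["Distance"] = nearest_distances
```

The distances come out in the units of the coordinates, not in miles. To get real-world distances, the data would presumably first need a suitable projected CRS (for example via to_crs) and a unit conversion.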

Answer 2

Here is another approach, based on scipy.spatial.distance.cdist. It avoids iterrows by using a NumPy masked array.

import geopandas as gpd
from scipy.spatial import distance
import numpy.ma as ma
from shapely.geometry import Point
import numpy as np

df = gpd.GeoDataFrame([[0, 'location A', Point(55,55)], 
                       [1, 'location B', Point(66,66)],
                       [2, 'Location C', Point(99,99)],
                       [3, 'Location D' ,Point(11,11)]], 
                      columns=['ID','Location','geometry'])

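# Stack the point coordinates into an (n, 2) array.
coords = np.stack(df.geometry.apply(lambda x: [x.x, x.y]))
# Pairwise distances, with zero distances (each point to itself) masked out.
distance_matrix = ma.masked_where((dist := distance.cdist(*[coords] * 2)) == 0, dist)
# Row position of the nearest point; it equals the ID here because IDs are 0..n-1.
df["closest_ID"] = np.argmin(distance_matrix, axis=0)
df = df.join(df.set_index("ID").geometry.rename("nearest_geometry"), on="closest_ID")
df.drop("closest_ID", axis=1)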

# Out:
   ID    Location               geometry           nearest_geometry
0   0  location A  POINT (55.000 55.000)  POINT (66.00000 66.00000)
1   1  location B  POINT (66.000 66.000)  POINT (55.00000 55.00000)
2   2  Location C  POINT (99.000 99.000)  POINT (66.00000 66.00000)
3   3  Location D  POINT (11.000 11.000)  POINT (55.00000 55.00000)
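
Two details of this approach are worth spelling out. Masking every zero distance also hides distinct rows that happen to share the same coordinates, and np.argmin returns row positions, which match the ID column only because the IDs run from 0 to n-1. A minimal sketch of a variant that excludes only the diagonal and maps positions back to IDs explicitly, and also keeps the distance itself (the ids array and variable names are illustrative assumptions, not part of the answer):

```python
import numpy as np
from scipy.spatial import distance

# Coordinates and IDs of the four sample locations.
coords = np.array([[55, 55], [66, 66], [99, 99], [11, 11]], dtype=float)
ids = np.array([0, 1, 2, 3])

# Full pairwise Euclidean distance matrix.
dist = distance.cdist(coords, coords)
# Exclude only self-matches by making the diagonal infinitely far away.
np.fill_diagonal(dist, np.inf)

# Position of the nearest other point in each row, mapped back to its ID.
nearest_pos = dist.argmin(axis=1)
nearest_ids = ids[nearest_pos]
# Distance to that nearest point, in coordinate units.
nearest_dist = dist[np.arange(len(coords)), nearest_pos]
```

With a GeoDataFrame, coords and ids would come from the geometry and ID columns, and nearest_ids and nearest_dist could be assigned back as new columns.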

Generalization to multiple neighbors

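Since distance_matrix holds the distances between all pairs of points, this approach generalizes easily to any number of neighbors. For example, to find the N_NEAREST = 2 nearest neighbors of each point, we can sort the distance matrix (using np.argsort instead of picking np.argmin as before) and select the corresponding number of columns: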

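N_NEAREST = 2
nearest_id_cols = list(map("nearest_id_{}".format, range(1, N_NEAREST + 1)))
nearest_geom_cols = list(map("nearest_geometry_{}".format, range(1, N_NEAREST + 1)))
# Masked (self) distances sort last, so the first columns are the nearest other points.
df[nearest_id_cols] = np.argsort(distance_matrix, axis=1)[:, :N_NEAREST]
# Note: pandas 2.1+ deprecates applymap in favor of DataFrame.map.
df[nearest_geom_cols] = df[nearest_id_cols].applymap(
                             lambda x: df.set_index("ID").geometry[x])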

# out:
   ID    Location                  geometry  nearest_id_1  nearest_id_2  \
0   0  location A  POINT (55.00000 55.00000)             1             2   
1   1  location B  POINT (66.00000 66.00000)             0             2   
2   2  Location C  POINT (99.00000 99.00000)             1             0   
3   3  Location D  POINT (11.00000 11.00000)             0             1   

  nearest_geometry_1 nearest_geometry_2  
0       POINT (66 66)       POINT (99 99)  
1       POINT (55 55)       POINT (99 99)  
2       POINT (66 66)       POINT (55 55)  
3       POINT (55 55)       POINT (66 66)
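
The full distance matrix grows quadratically with the number of points. That is no problem for 25 locations, but for much larger data sets a spatial index may be preferable. A minimal sketch of the same N_NEAREST lookup using scipy.spatial.KDTree, offered as an alternative technique and not as part of the answer above (variable names are illustrative, and it assumes no two points share the same coordinates):

```python
import numpy as np
from scipy.spatial import KDTree

coords = np.array([[55, 55], [66, 66], [99, 99], [11, 11]], dtype=float)
N_NEAREST = 2

tree = KDTree(coords)
# Ask for one extra neighbor: each point's closest match is itself (distance 0).
distances, positions = tree.query(coords, k=N_NEAREST + 1)

# Drop the first column (the point itself) to keep only the other points.
neighbor_distances = distances[:, 1:]
neighbor_positions = positions[:, 1:]
```

As with np.argsort, neighbor_positions holds row positions, so they would need to be mapped to the ID column if the IDs are not simply 0 to n-1. When two candidates are equally far away, as with Location C and Location D seen from location A, the order among them is not guaranteed.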
