Python merge/join two dataframes by geometric condition

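I have a GeoDataFrame of points, df_points (~50k rows), and a GeoDataFrame of polygons, df_polygons (~50k rows).

I want to merge the two dataframes, keeping the columns of df_points and adding the matching columns from df_polygons for every polygon that contains the point.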

import geopandas as gpd
from shapely.geometry import Point, Polygon

_polygons = [ Polygon([(5, 5), (5, 13), (13, 13), (13, 5)]), Polygon([(10, 10), (10, 15), (15, 15), (15, 10)]) ]
_pnts = [Point(3, 3), Point(8, 8), Point(11, 11)]
df_polygons = gpd.GeoDataFrame(geometry=_polygons, index=['foo', 'bar']).reset_index()
df_points = gpd.GeoDataFrame(geometry=_pnts, index=['A', 'B', 'C']).reset_index()

df_points looks like:

> df_points
    index   geometry
0   A       POINT (3.00000 3.00000)
1   B       POINT (8.00000 8.00000)
2   C       POINT (11.00000 11.00000)

df_polygons looks like:

> df_polygons
    index   geometry
0   foo     POLYGON ((5.00000 5.00000, 5.00000 13.00000, 1...
1   bar     POLYGON ((10.00000 10.00000, 10.00000 15.00000...

The result could look like this:

    index   geometry_points            geometry_index   geometry_polygons
0   A       POINT (3.00000 3.00000)    []               []
1   B       POINT (8.00000 8.00000)    ['foo']          [Polygon([(5, 5), (5, 13), (13, 13), (13, 5)])]
2   C       POINT (11.00000 11.00000)  ['foo','bar']    [Polygon([(5, 5), (5, 13), (13, 13), (13, 5)]), Polygon([(10, 10), (10, 15), (15, 15), (15, 10)])]

Is there an efficient way to merge the dataframes?
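For reference, the matching rule described above can be pinned down with a brute-force double loop. This is only a sketch to make the expected result concrete, not an efficient solution: it uses Shapely's `within` predicate directly, and the dictionary names `polygons`, `points` and `matches` are illustrative, not from the original code.

```python
from shapely.geometry import Point, Polygon

# Same toy data as in the question, keyed by label
polygons = {
    'foo': Polygon([(5, 5), (5, 13), (13, 13), (13, 5)]),
    'bar': Polygon([(10, 10), (10, 15), (15, 15), (15, 10)]),
}
points = {'A': Point(3, 3), 'B': Point(8, 8), 'C': Point(11, 11)}

# Brute force: test every point against every polygon (n * m checks)
matches = {
    name: [label for label, poly in polygons.items() if pt.within(poly)]
    for name, pt in points.items()
}
print(matches)  # {'A': [], 'B': ['foo'], 'C': ['foo', 'bar']}
```

With ~50k points and ~50k polygons this would need about 2.5 billion predicate checks, which is why a join that can use a spatial index is preferable.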

Use a spatial join (gpd.sjoin):

# Rename 'index' columns to avoid FutureWarning
dfp = df_points.rename(columns={'index': 'point'})
dfa = df_polygons.rename(columns={'index': 'area'})

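# Find points within polygons. how='left' keeps points that fall inside no
# polygon (point A), as shown in the output below. Newer GeoPandas releases
# take predicate=; older ones used op= for the same argument.
out = gpd.sjoin(dfp, dfa, how='left', predicate='within')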

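# Reduce to one row per point. dropna() discards the NaN placeholder of an
# unmatched point (present when the join is how='left'), so that point gets []
out = out.groupby('point') \
         .agg({'area': lambda x: x.dropna().tolist(),
               'index_right': lambda x: dfa.loc[x.dropna().astype(int),
                                                'geometry'].tolist()}) \
         .reset_index()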

# Append columns
dfp = dfp.merge(out, on='point')

Output:

>>> dfp
  point                   geometry        area                                        index_right
0     A    POINT (3.00000 3.00000)          []                                                 []
1     B    POINT (8.00000 8.00000)       [foo]          [POLYGON ((5 5, 5 13, 13 13, 13 5, 5 5))]
2     C  POINT (11.00000 11.00000)  [foo, bar]  [POLYGON ((5 5, 5 13, 13 13, 13 5, 5 5)), POLY...
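The row-reduction step can be tried in isolation with pandas alone. The sketch below uses a hand-written frame, `joined`, as a hypothetical stand-in for what a left spatial join returns for the toy data (one row per point/polygon match, plus a NaN row for a point that matched nothing). It is an assumption for illustration, not actual `gpd.sjoin` output.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a left spatial join result: one row per
# (point, polygon) match, and a single NaN row for the unmatched point A
joined = pd.DataFrame({
    'point': ['A', 'B', 'C', 'C'],
    'area': [np.nan, 'foo', 'foo', 'bar'],
})

# Drop the NaN placeholder before building each list, so A ends up with []
reduced = (
    joined.groupby('point')['area']
          .agg(lambda s: s.dropna().tolist())
          .reset_index()
)
print(reduced)
```

Dropping missing values explicitly avoids relying on how `any()` or `all()` treat NaN, which is easy to get wrong when a point lies only in the polygon at position 0 or only in polygons at non-zero positions.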