Efficient way to update values in a GeoDataFrame based on the result of the DataFrame.within method
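I have two large GeoDataFrames:
The first comes from a shapefile in which each polygon has a float value called 'asapp'.
The second holds the centroids of a 3x3 meter fishnet grid, with the 'asapp' column set to zero.
I need to fill 'asapp' in the second one wherever a centroid lies within a polygon of the first.
The code below does this, but it is ridiculously slow: 15 polygons per second (one of the smallest shapefiles has more than 20,000 polygons).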
import geopandas as gpd
import numpy as np
# fishnet_grid is a dict created by GDAL from a raster with a 3 m pixel size
cells_in_wsg = np.array([(self.__convert_geom_sirgas(geom, ogr_transform), int(fid), 0.0) for fid, geom in fishnet_grid.items()])
# turn the grid cells (square polygons) into a GeoDataFrame of points, using the centroids of the cells
fishnet_base = gpd.GeoDataFrame({'geometry': cells_in_wsg[..., 0], 'id': cells_in_wsg[..., 1], 'asapp': cells_in_wsg[..., 2]})
fishnet = gpd.GeoDataFrame({'geometry': fishnet_base.centroid, 'id': fishnet_base['id'], 'asapp': fishnet_base['asapp']})
# as_applied_data is the polygons GeoDataFrame
# the loop below takes a very long time to complete
for _, as_applied in as_applied_data.iterrows():
    fishnet.loc[fishnet.within(as_applied['geometry']), 'asapp'] += as_applied['asapp']
Is there another way to do this with better performance?
Thanks!
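I solved the problem.
I read about using geopandas.overlay (https://geopandas.org/en/stable/docs/user_guide/set_operations.html) to handle large numbers of polygons, but the catch is that it only works with polygons, and I have polygons and points.
So my solution was to create very small polygons from the points (squares with 2 cm sides) and then use overlay.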
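The code below does not show how the little squares are built, so here is a rough sketch of one way it could be done. The helper name `points_to_squares` and the toy points are hypothetical, and it assumes the points are in a projected CRS whose units are meters, so that 0.02 means 2 cm.

```python
import geopandas as gpd
from shapely.geometry import Point, box


def points_to_squares(points: gpd.GeoDataFrame, side: float = 0.02) -> gpd.GeoDataFrame:
    """Return a copy of `points` where each point is replaced by an
    axis-aligned square with the given side length, centred on the point.

    Assumption: coordinates are in meters (projected CRS).
    """
    half = side / 2.0
    squares = [
        box(p.x - half, p.y - half, p.x + half, p.y + half)
        for p in points.geometry
    ]
    # Keep the attribute columns, swap in the new geometries.
    attributes = points.drop(columns=points.geometry.name)
    return gpd.GeoDataFrame(attributes, geometry=squares, crs=points.crs)


# Toy data standing in for the real centroids (hypothetical values).
pts = gpd.GeoDataFrame({'id': [1, 2]}, geometry=[Point(0, 0), Point(3, 0)])
little_squares = points_to_squares(pts)
```

The resulting GeoDataFrame contains only polygons, so it can be passed to gpd.overlay.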
Final code:
# fishnet is now a GeoDataFrame of little squares
fishnet = gpd.GeoDataFrame({'geometry': cells_in_wsg[..., 0], 'id': cells_in_wsg[..., 1]})
# intersection holds only the little squares that intersect as_applied_data polygons, along with the values of those polygons
intersection = gpd.overlay(fishnet, as_applied_data, how='intersection')
# now it is just a matter of computing the mean per square and merging it back into the fishnet
# (selecting 'asapp' explicitly keeps the non-numeric geometry column out of mean())
values = fishnet.merge(intersection.groupby('id', as_index=False)['asapp'].mean(), on='id')
# values now holds the little squares, their ids and the mean values of the intersections!
It works great!
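For anyone who would rather keep the centroids as points, a spatial join might be worth trying as well. This is not what I used above, only a minimal sketch with toy data: the geometries and values are made up, and it assumes geopandas 0.10 or later, where gpd.sjoin takes the `predicate` argument (older versions call it `op`).

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Toy stand-ins for the real data (hypothetical values).
as_applied_data = gpd.GeoDataFrame(
    {'asapp': [10.0, 5.0]},
    geometry=[box(0, 0, 10, 10), box(5, 5, 15, 15)],  # the two polygons overlap
)
fishnet = gpd.GeoDataFrame(
    {'id': [1, 2, 3]},
    geometry=[Point(2, 2), Point(7, 7), Point(20, 20)],
)

# One row per (centroid, containing polygon) pair.
joined = gpd.sjoin(
    fishnet,
    as_applied_data[['asapp', 'geometry']],
    how='inner',
    predicate='within',
)

# Summing per centroid mirrors the `+=` in the original loop.
summed = joined.groupby('id')['asapp'].sum()

# Centroids that fall in no polygon keep 0.0.
fishnet['asapp'] = fishnet['id'].map(summed).fillna(0.0)
```

Here the centroid at (7, 7) lies in both polygons, so it gets 10.0 + 5.0. Swapping `.sum()` for `.mean()` would give the averaging behaviour of the overlay version instead.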