不同的形状文件,在多边形形状文件中找到点数

Different shapefiles, find number of points in multipolygon shapefile

我有一个包含数千个多边形的 shapefile,以及许多不同的 shapefile。接下来我提供了一个带有一些玩具实例的小 example/code(一个多边形 gpds 和两个点 gpds)。

import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon

import random


# Create polygons
p1 = Polygon( [(0, 0), (0, 1), (0.5, 0.5)])
p2 = Polygon([(.75, 0), (1, .5), (.75, 1), (1.75, 0)])
p3 = Polygon([(2, 0), (2, 1), (3, 0)])
polygons = gpd.GeoDataFrame([p1, p2, p3], columns={'geometry'})
polygons['name'] = ["a", "b", "c"]


# Create red points
redPoints =  gpd.GeoDataFrame(gpd.points_from_xy( np.random.uniform(0, 3, 50), np.random.uniform(0, 1, 50)), columns = ['geometry'])

# Create green points
greenPoints =  gpd.GeoDataFrame(gpd.points_from_xy( np.random.uniform(1, 3, 50), np.random.uniform(0, 1, 50)), columns = ['geometry'])

# plot polygons, red and green points
# ax = polygons.plot()
# redPoints.plot(ax = ax, c = 'r')
# greenPoints.plot(ax = ax, c = 'g')



我需要使用多面体的每个 inside/intersects 来计算每种点的数量,并为每种点类型创建一个特定的列。我知道如何使用两个 shapefile 执行此操作,如下所示:


# count green (g) points
gpointsInPolygon = gpd.sjoin(polygons,greenPoints, op='intersects')
gpointsInPolygon['number of green points']=1
gpointsInPolygon = gpointsInPolygon.groupby(['name']).agg({'number of green points':sum}).reset_index()

cgps = gpointsInPolygon.set_index('name').join(polygons.set_index('name'), how ='right')
cgps.fillna(0, inplace=True)



# count red (r) points
rpointsInPolygon = gpd.sjoin(polygons,redPoints, op='intersects')
rpointsInPolygon['number of red points']=1
rpointsInPolygon = rpointsInPolygon.groupby(['name']).agg({'number of red points':sum}).reset_index()

crps = rpointsInPolygon.set_index('name').join(polygons.set_index('name'), how ='right')
crps.fillna(0, inplace=True)
#crs



redgreens = pd.merge(cgps, crps, on="geometry")

result = pd.merge(redgreens, polygons, on='geometry' )

我认为这不是一种有效的方法(不是吗?)。因为我有超过 50 个不同类型点的大型 shapefile。我正在寻找一种有效的方法来为大点形状文件实现这一点。谢谢!

能否将所有点合并到一个 GeoDataFrame 中,有一个 name/category 列('red'、'blue' 等),然后只进行一次连接?是这样的吗?

你给的数据

import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon
import random
import matplotlib.pyplot as plt

# Create polygons
p1 = Polygon([(0, 0), (0, 1), (0.5, 0.5)])
p2 = Polygon([(0.75, 0), (1, 0.5), (0.75, 1), (1.75, 0)])
p3 = Polygon([(2, 0), (2, 1), (3, 0)])
polygons = gpd.GeoDataFrame([p1, p2, p3], columns={"geometry"})
polygons["name"] = ["a", "b", "c"]

# Create red points
redPoints = gpd.GeoDataFrame(
    gpd.points_from_xy(np.random.uniform(0, 3, 50), np.random.uniform(0, 1, 50)),
    columns=["geometry"],
)

# Create green points
greenPoints = gpd.GeoDataFrame(
    gpd.points_from_xy(np.random.uniform(1, 3, 50), np.random.uniform(0, 1, 50)),
    columns=["geometry"],
)

画出来检查

fig, ax = plt.subplots()
polygons.plot(ax=ax, edgecolor="blue", color="none")
greenPoints.plot(ax=ax, c="green")
redPoints.plot(ax=ax, c="red")

合并成一个GeoDataFrame

greenPoints["name"] = "green"
redPoints["name"] = "red"
allPoints = pd.concat([greenPoints, redPoints], axis=0)
display(allPoints)

相交/连接多边形和点

intersect = gpd.sjoin(polygons, allPoints)
display(intersect)

计数(最终答案)

mycounts = (
    intersect.groupby(["name_left", "name_right"])
    .size()
    .reset_index(name="counts")
    .rename(columns={"name_left": "polygon_id", "name_right": "point_id"})
)
display(mycounts)

根据 OP 的评论进行编辑:

是的,如果要添加多边形几何信息,可以使用pd.merge()。您必须仔细管理您的列名。

在上面的回答中,我在mycounts = ...中将多边形名称列重命名为polygon_id。因此,如果您只是添加到我的答案中,您将使用以下内容来获取多边形几何列。:

mycounts2 = pd.merge(
    mycounts,
    polygons.copy().rename(columns={"name": "polygon_id"}),
    how="left",
    on="polygon_id",
)

polygons.copy().rename(...部分

  1. 制作 polygons GeoDataFrame 的副本(这对于确保您不改变原始 GeoDataFrame 很重要!),并且
  2. 重命名列。

另请注意,您不能将点的几何形状添加到 mycounts,因为 mycounts DataFrame 是 groupby 的结果。如果您需要点的几何形状,则需要执行以下操作。同样,您将必须弄清楚数据集的列名管理。

pts2 = pd.merge(
    allPoints,
    intersect.copy().rename(columns={"name_right": "name"}),
    how="left",
    on="name",
)

new_cols = {
    "geometry_x": "geom_pt",
    "name": "id_pt",
    "geometry_y": "geom_poly",
    "name_left": "id_poly",
    "index_right": "idx_pt",
}

pts2.rename(columns=new_cols, inplace=True)