不同的形状文件,在多边形形状文件中找到点数
Different shapefiles, find number of points in multipolygon shapefile
我有一个包含数千个多边形的 shapefile,以及许多不同的 shapefile。接下来我提供了一个带有一些玩具实例的小 example/code(一个多边形 gpds 和两个点 gpds)。
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon
import random
# Create polygons
p1 = Polygon( [(0, 0), (0, 1), (0.5, 0.5)])
p2 = Polygon([(.75, 0), (1, .5), (.75, 1), (1.75, 0)])
p3 = Polygon([(2, 0), (2, 1), (3, 0)])
polygons = gpd.GeoDataFrame([p1, p2, p3], columns={'geometry'})
polygons['name'] = ["a", "b", "c"]
# Create red points
redPoints = gpd.GeoDataFrame(gpd.points_from_xy( np.random.uniform(0, 3, 50), np.random.uniform(0, 1, 50)), columns = ['geometry'])
# Create green points
greenPoints = gpd.GeoDataFrame(gpd.points_from_xy( np.random.uniform(1, 3, 50), np.random.uniform(0, 1, 50)), columns = ['geometry'])
# plot polygons, red and green points
# ax = polygons.plot()
# redPoints.plot(ax = ax, c = 'r')
# greenPoints.plot(ax = ax, c = 'g')
我需要使用多面体的每个 inside/intersects 来计算每种点的数量,并为每种点类型创建一个特定的列。我知道如何使用两个 shapefile 执行此操作,如下所示:
# count green (g) points
gpointsInPolygon = gpd.sjoin(polygons,greenPoints, op='intersects')
gpointsInPolygon['number of green points']=1
gpointsInPolygon = gpointsInPolygon.groupby(['name']).agg({'number of green points':sum}).reset_index()
cgps = gpointsInPolygon.set_index('name').join(polygons.set_index('name'), how ='right')
cgps.fillna(0, inplace=True)
# count red (r) points
rpointsInPolygon = gpd.sjoin(polygons,redPoints, op='intersects')
rpointsInPolygon['number of red points']=1
rpointsInPolygon = rpointsInPolygon.groupby(['name']).agg({'number of red points':sum}).reset_index()
crps = rpointsInPolygon.set_index('name').join(polygons.set_index('name'), how ='right')
crps.fillna(0, inplace=True)
#crs
redgreens = pd.merge(cgps, crps, on="geometry")
result = pd.merge(redgreens, polygons, on='geometry' )
我认为这不是一种有效的方法(不是吗?)。因为我有超过 50 个不同类型点的大型 shapefile。我正在寻找一种有效的方法来为大点形状文件实现这一点。谢谢!
能否将所有点合并到一个 GeoDataFrame 中,有一个 name/category 列('red'、'blue' 等),然后只进行一次连接?是这样的吗?
你给的数据
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon
import random
import matplotlib.pyplot as plt
# Create polygons
p1 = Polygon([(0, 0), (0, 1), (0.5, 0.5)])
p2 = Polygon([(0.75, 0), (1, 0.5), (0.75, 1), (1.75, 0)])
p3 = Polygon([(2, 0), (2, 1), (3, 0)])
polygons = gpd.GeoDataFrame([p1, p2, p3], columns={"geometry"})
polygons["name"] = ["a", "b", "c"]
# Create red points
redPoints = gpd.GeoDataFrame(
gpd.points_from_xy(np.random.uniform(0, 3, 50), np.random.uniform(0, 1, 50)),
columns=["geometry"],
)
# Create green points
greenPoints = gpd.GeoDataFrame(
gpd.points_from_xy(np.random.uniform(1, 3, 50), np.random.uniform(0, 1, 50)),
columns=["geometry"],
)
画出来检查
fig, ax = plt.subplots()
polygons.plot(ax=ax, edgecolor="blue", color="none")
greenPoints.plot(ax=ax, c="green")
redPoints.plot(ax=ax, c="red")
合并成一个GeoDataFrame
greenPoints["name"] = "green"
redPoints["name"] = "red"
allPoints = pd.concat([greenPoints, redPoints], axis=0)
display(allPoints)
相交/连接多边形和点
intersect = gpd.sjoin(polygons, allPoints)
display(intersect)
计数(最终答案)
mycounts = (
intersect.groupby(["name_left", "name_right"])
.size()
.reset_index(name="counts")
.rename(columns={"name_left": "polygon_id", "name_right": "point_id"})
)
display(mycounts)
根据 OP 的评论进行编辑:
是的,如果要添加多边形几何信息,可以使用pd.merge()
。您必须仔细管理您的列名。
在上面的回答中,我在mycounts = ...
中将多边形名称列重命名为polygon_id
。因此,如果您只是添加到我的答案中,您将使用以下内容来获取多边形几何列。:
mycounts2 = pd.merge(
mycounts,
polygons.copy().rename(columns={"name": "polygon_id"}),
how="left",
on="polygon_id",
)
这polygons.copy().rename(...
部分
- 制作
polygons
GeoDataFrame 的副本(这对于确保您不改变原始 GeoDataFrame 很重要!),并且
- 重命名列。
另请注意,您不能将点的几何形状添加到 mycounts
,因为 mycounts
DataFrame 是 groupby
的结果。如果您需要点的几何形状,则需要执行以下操作。同样,您将必须弄清楚数据集的列名管理。
pts2 = pd.merge(
allPoints,
intersect.copy().rename(columns={"name_right": "name"}),
how="left",
on="name",
)
new_cols = {
"geometry_x": "geom_pt",
"name": "id_pt",
"geometry_y": "geom_poly",
"name_left": "id_poly",
"index_right": "idx_pt",
}
pts2.rename(columns=new_cols, inplace=True)
我有一个包含数千个多边形的 shapefile,以及许多不同的 shapefile。接下来我提供了一个带有一些玩具实例的小 example/code(一个多边形 gpds 和两个点 gpds)。
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon
import random
# Create polygons
p1 = Polygon( [(0, 0), (0, 1), (0.5, 0.5)])
p2 = Polygon([(.75, 0), (1, .5), (.75, 1), (1.75, 0)])
p3 = Polygon([(2, 0), (2, 1), (3, 0)])
polygons = gpd.GeoDataFrame([p1, p2, p3], columns={'geometry'})
polygons['name'] = ["a", "b", "c"]
# Create red points
redPoints = gpd.GeoDataFrame(gpd.points_from_xy( np.random.uniform(0, 3, 50), np.random.uniform(0, 1, 50)), columns = ['geometry'])
# Create green points
greenPoints = gpd.GeoDataFrame(gpd.points_from_xy( np.random.uniform(1, 3, 50), np.random.uniform(0, 1, 50)), columns = ['geometry'])
# plot polygons, red and green points
# ax = polygons.plot()
# redPoints.plot(ax = ax, c = 'r')
# greenPoints.plot(ax = ax, c = 'g')
我需要使用多面体的每个 inside/intersects 来计算每种点的数量,并为每种点类型创建一个特定的列。我知道如何使用两个 shapefile 执行此操作,如下所示:
# count green (g) points
gpointsInPolygon = gpd.sjoin(polygons,greenPoints, op='intersects')
gpointsInPolygon['number of green points']=1
gpointsInPolygon = gpointsInPolygon.groupby(['name']).agg({'number of green points':sum}).reset_index()
cgps = gpointsInPolygon.set_index('name').join(polygons.set_index('name'), how ='right')
cgps.fillna(0, inplace=True)
# count red (r) points
rpointsInPolygon = gpd.sjoin(polygons,redPoints, op='intersects')
rpointsInPolygon['number of red points']=1
rpointsInPolygon = rpointsInPolygon.groupby(['name']).agg({'number of red points':sum}).reset_index()
crps = rpointsInPolygon.set_index('name').join(polygons.set_index('name'), how ='right')
crps.fillna(0, inplace=True)
#crs
redgreens = pd.merge(cgps, crps, on="geometry")
result = pd.merge(redgreens, polygons, on='geometry' )
我认为这不是一种有效的方法(不是吗?)。因为我有超过 50 个不同类型点的大型 shapefile。我正在寻找一种有效的方法来为大点形状文件实现这一点。谢谢!
能否将所有点合并到一个 GeoDataFrame 中,有一个 name/category 列('red'、'blue' 等),然后只进行一次连接?是这样的吗?
你给的数据
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely.geometry import Polygon
import random
import matplotlib.pyplot as plt
# Create polygons
p1 = Polygon([(0, 0), (0, 1), (0.5, 0.5)])
p2 = Polygon([(0.75, 0), (1, 0.5), (0.75, 1), (1.75, 0)])
p3 = Polygon([(2, 0), (2, 1), (3, 0)])
polygons = gpd.GeoDataFrame([p1, p2, p3], columns={"geometry"})
polygons["name"] = ["a", "b", "c"]
# Create red points
redPoints = gpd.GeoDataFrame(
gpd.points_from_xy(np.random.uniform(0, 3, 50), np.random.uniform(0, 1, 50)),
columns=["geometry"],
)
# Create green points
greenPoints = gpd.GeoDataFrame(
gpd.points_from_xy(np.random.uniform(1, 3, 50), np.random.uniform(0, 1, 50)),
columns=["geometry"],
)
画出来检查
fig, ax = plt.subplots()
polygons.plot(ax=ax, edgecolor="blue", color="none")
greenPoints.plot(ax=ax, c="green")
redPoints.plot(ax=ax, c="red")
合并成一个GeoDataFrame
greenPoints["name"] = "green"
redPoints["name"] = "red"
allPoints = pd.concat([greenPoints, redPoints], axis=0)
display(allPoints)
相交/连接多边形和点
intersect = gpd.sjoin(polygons, allPoints)
display(intersect)
计数(最终答案)
mycounts = (
intersect.groupby(["name_left", "name_right"])
.size()
.reset_index(name="counts")
.rename(columns={"name_left": "polygon_id", "name_right": "point_id"})
)
display(mycounts)
根据 OP 的评论进行编辑:
是的,如果要添加多边形几何信息,可以使用pd.merge()
。您必须仔细管理您的列名。
在上面的回答中,我在mycounts = ...
中将多边形名称列重命名为polygon_id
。因此,如果您只是添加到我的答案中,您将使用以下内容来获取多边形几何列。:
mycounts2 = pd.merge(
mycounts,
polygons.copy().rename(columns={"name": "polygon_id"}),
how="left",
on="polygon_id",
)
这polygons.copy().rename(...
部分
- 制作
polygons
GeoDataFrame 的副本(这对于确保您不改变原始 GeoDataFrame 很重要!),并且 - 重命名列。
另请注意,您不能将点的几何形状添加到 mycounts
,因为 mycounts
DataFrame 是 groupby
的结果。如果您需要点的几何形状,则需要执行以下操作。同样,您将必须弄清楚数据集的列名管理。
pts2 = pd.merge(
allPoints,
intersect.copy().rename(columns={"name_right": "name"}),
how="left",
on="name",
)
new_cols = {
"geometry_x": "geom_pt",
"name": "id_pt",
"geometry_y": "geom_poly",
"name_left": "id_poly",
"index_right": "idx_pt",
}
pts2.rename(columns=new_cols, inplace=True)