How to join points with polygons in GeoPandas

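I have polygons defined by coordinate pairs (lat1-long1, lat2-long2, ...) and points given as Lat-Long.

I am using the GeoPandas library to find out whether any point lies inside a polygon.

Sample polygon data, saved in a CSV file: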

  1. POLYGON((28.56056 77.36535,28.564635293716776 77.3675137204626,28.56871055311656 77.36967760850214,28.572785778190855 77.3718416641586,28.576860968931193 77.37400588747194,28.580936125329096 77.3761702784821,28.585011247376094 77.37833483722912,28.58908633506372 77.38049956375293,28.593161388383457 77.38266445809356,28.59723640732686 77.38482952029099,28.60131139188541 77.38699475038526,28.605386342050664 77.38916014841635,28.60946125781409 77.39132571442434,28.613536139167238 77.39349144844923,28.61761098610158 77.39565735053108,28.62168579860863 77.39782342070995,28.62576057667991 77.39998965902589,28.62983532030691 77.402156065519,28.633910029481108 77.40432264022931,28.637984704194054 77.40648938319696,28.642059344437207 77.408656294462,28.64068221074683 77.41187044231611,28.63920739580329 77.41502778244606,28.63763670052024 77.41812446187686,28.635972042808007 77.42115670220443,28.634215455216115 77.42412080422613,28.63236908243526 77.42701315247152,28.630435178662026 77.42983021962735,28.628416104829583 77.43256857085188,28.626314325707924 77.43522486797251,28.624132406877322 77.437795873562,28.621873011578572 77.44027845488824,28.619538897444272 77.4426695877325,28.617132913115164 77.44496636007166,28.614657994745563 77.44716597562005,28.612117162402576 77.44926575722634,28.609513516363293 77.45126315012166,28.606850233314923 77.45315572501488,28.604130562462267 77.45494118103147,28.60135782154758 77.45661734849246,28.598535392787774 77.45818219153013,28.595666718733966 77.45963381053753,28.592755298058414 77.46097044444889,28.589804681274302 77.46219047284835,28.586818466393503 77.46329241790465,28.583800294527727 77.46427494612952,28.58075384543836 77.46513686995802,28.57768283304089 77.46587714914885,28.574591000868892 77.4664948920035,28.571482117503592 77.46698935640259,28.568359971974488 77.46735995065883,28.565228369136484 77.46760623418534,28.56209112502966 77.4677279179792,28.558952062226695 77.4677248649196,28.55581500517431 
77.46759708988064,28.552683775533943 77.46734475965891,28.552683775533943 77.46734475965891,28.553079397193876 77.4622453846313,28.553474828308865 77.45714597129259,28.55387006887434 77.4520465196603,28.554265118885752 77.44694702975198,28.554659978338513 77.4418475015852,28.555054647228083 77.43674793517746,28.555449125549913 77.43164833054634,28.555843413299442 77.42654868770937,28.55623751047213 77.42144900668411,28.556631417063407 77.41634928748812,28.55702513306874 77.41124953013893,28.55741865848359 77.40614973465412,28.557811993303396 77.40104990105122,28.55820513752363 77.39595002934782,28.558598091139757 77.39085011956145,28.558990854147225 77.38575017170969,28.559383426541523 77.3806501858101,28.559775808318093 77.37555016188024,28.560167999472434 77.37045009993768,28.56056 77.36535))

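The second dataset is a CSV file with separate LAT and LONG columns, for example 28.56282 and 77.36824.

I use the Python code below to join the two datasets on the condition that the point lies within the polygon: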

import pandas as pd
import shapely.geometry
from shapely.geometry import Point
import geopandas as gpd

site_df = pd.read_csv(r'lat_long_file.csv')  # load the lat/long file
site_df['geometry'] = site_df.apply(lambda x: Point(x.LAT, x.LONG), axis='columns')  # convert lat and long to a point

gdf = gpd.GeoDataFrame(site_df, geometry=site_df.geometry, crs='EPSG:4326')  # GeoDataFrame of points

from shapely import wkt
polygon_df = pd.read_csv(r'polygon_csv_file')  # read the raw polygon strings
polygon_df['geometry'] = polygon_df.apply(lambda row: shapely.wkt.loads(row.polygon), axis='columns')  # convert each polygon string to a geometry

gd_polygon = gpd.GeoDataFrame(polygon_df, geometry=polygon_df.geometry, crs='EPSG:4326')  # GeoDataFrame of polygons

import shapely.speedups
shapely.speedups.enable()  # this makes some spatial queries run faster

join_data = gpd.sjoin(gdf, gd_polygon, how="inner", op="within")  # actual join condition

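However, the query returns nothing, even though the point lies inside the polygon, as shown in the image below.

The green location marker is the lat-long point that lies inside the polygon.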

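  • Your sample data is unusable because it is an image
  • Sourced a polygon: a UK county boundary
  • Constructed a geopandas data frame of a point within that county
  • Used plotly to show the data visually
  • Used your code snippet gpd.sjoin(gdf, gd_polygon, how="inner", op="within") to do the spatial join, and it correctly joins the point to the polygon
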
import requests, json
import geopandas as gpd
import plotly.express as px
import shapely.geometry

# fmt: off
# get a polygon and construct a point
res = requests.get("https://opendata.arcgis.com/datasets/69dc11c7386943b4ad8893c45648b1e1_0.geojson")
gd_polygon = gpd.GeoDataFrame.from_features(res.json()).loc[lambda d: d["LAD20NM"].str.contains("Hereford")]
gdf = gpd.GeoDataFrame(geometry=gd_polygon.loc[:,["LONG","LAT"]].apply(shapely.geometry.Point, axis=1)).reset_index(drop=True)
# fmt: on

# plot to show point is within polygon
px.scatter_mapbox(gd_polygon, lon="LONG", lat="LAT").update_traces(
    name="gd_polygon"
).add_traces(
    px.scatter_mapbox(gdf, lat=gdf.geometry.y, lon=gdf.geometry.x)
    .update_traces(name="gdf", marker_color="red")
    .data
).update_traces(
    showlegend=True
).update_layout(
    mapbox={
        "style": "carto-positron",
        "layers": [
            {"source": json.loads(gd_polygon.geometry.to_json()), "type": "line"}
        ],
    }
).show()

# spatial join, all good :-)
gpd.sjoin(gdf, gd_polygon, how="inner", op="within")

Output

  • The spatial join works; the point is within the polygon
geometry index_right OBJECTID LAD20CD LAD20NM LAD20NMW BNG_E BNG_N LONG LAT Shape__Area Shape__Length
0 POINT (-2.73931 52.081539) 18 19 E06000019 Herefordshire, County of 349434 242834 -2.73931 52.0815 2.18054e+09 285427
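
One thing to watch if you run this on a recent install: as far as I know, newer GeoPandas releases (0.10 onwards) call this argument predicate, with op deprecated in its favour. Below is a minimal sketch of the same join written that way. The two points, the unit-square polygon and the site/zone column names are made up for illustration and are not taken from the question.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: one point inside the unit square, one outside it.
points = gpd.GeoDataFrame(
    {"site": ["a", "b"]},
    geometry=[Point(0.5, 0.5), Point(5.0, 5.0)],
    crs="EPSG:4326",
)
polygons = gpd.GeoDataFrame(
    {"zone": ["unit_square"]},
    geometry=[box(0, 0, 1, 1)],  # minx, miny, maxx, maxy
    crs="EPSG:4326",
)

# Same spatial join as above, using predicate= instead of op=
# (assumes GeoPandas 0.10 or later).
joined = gpd.sjoin(points, polygons, how="inner", predicate="within")
print(joined[["site", "zone"]])
```

With how="inner", only the point that falls inside a polygon is kept, so site "b" is dropped from the result.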

I would check the axis order: WKT is usually interpreted as longitude first, latitude second, whereas you construct your points in latitude, longitude order.
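
To illustrate why the order matters, here is a minimal sketch that builds the sample point from the question in both axis orders and tests it against a rectangle. The rectangle is a made-up bounding box written in longitude, latitude order; it is not your polygon. The LAT and LONG column names are taken from the question, and gpd.points_from_xy expects x first, then y.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import box

# Sample point from the question, plus an illustrative bounding box
# (an assumption for this sketch) in x=longitude, y=latitude order.
sites = pd.DataFrame({"LAT": [28.56282], "LONG": [77.36824]})
area = box(77.30, 28.50, 77.50, 28.70)  # minx, miny, maxx, maxy

# x=longitude, y=latitude: same axis order as the box
lon_lat = gpd.GeoSeries(gpd.points_from_xy(sites["LONG"], sites["LAT"]))
# x=latitude, y=longitude: swapped relative to the box
lat_lon = gpd.GeoSeries(gpd.points_from_xy(sites["LAT"], sites["LONG"]))

matches_when_consistent = bool(lon_lat.within(area).iloc[0])
matches_when_swapped = bool(lat_lon.within(area).iloc[0])
print(matches_when_consistent, matches_when_swapped)  # True False
```

What matters is that the points and the polygons use the same order. If only one of the two is swapped, the join comes back empty.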

You could also try removing the CRS identifier to see whether that changes the result.

See also https://gis.stackexchange.com/questions/376751/shapely-flips-lat-long-coordinate and https://pyproj4.github.io/pyproj/stable/gotchas.html#axis-order-changes-in-proj-6