geopandas plotting - Identify locations that fall outside of the map
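I have a shapefile showing a map of Pakistan's districts. I also have a geodataframe with information about polling stations in Pakistan.
I have plotted the geodataframe over the shapefile, but noticed that some of the lat/lon values in the geodataframe are wrong, i.e. they fall outside Pakistan.
I want to identify which polling stations these are (that is, I want to select those rows from the geodataframe). Is there a way to do this?
Please see the image below. The black dots are polling stations and the colored map is the map of Pakistan's districts:
image_pakistan_map_pollingstations
Edit:
So I am trying this, and it seems to work, but it takes very long to run (it has been running for over 5 hours now). For reference, the geodataframe has about 50,000 rows and is called ours_NA_gdf.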
for i in range(len(ours_NA_gdf)):
    if ours_NA_gdf['geometry'][i].within(pakistan['geometry'][0]):
        ours_NA_gdf.at[i, 'loc_validity'] = 'T'
    else:
        ours_NA_gdf.at[i, 'loc_validity'] = 'F'
ours_NA_gdf[ours_NA_gdf['loc_validity']=='F']
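I suspect the problem is the Pakistan geometries you are using: they are probably too complex and detailed to work with efficiently. For your use case, the simple geometries provided by naturalearth_lowres should give better performance. Below is runnable code that demonstrates how to use the simple Pakistan geometry to perform the contains() operation and to assign a color attribute to the points that are plotted on the map.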
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from cartopy import crs as ccrs
# create a geoDataFrame of points locations across Pakistan areas
pp = 40
lons = np.linspace(60, 80, pp)
lats = np.linspace(22, 39, pp)
# create point geometry
# points will be plotted across Pakistan in red (outside) and green (inside)
points = [Point(xy) for xy in zip(lons, lats)]
# create a dataframe of 3 columns
mydf = pd.DataFrame({'longitude': lons, 'latitude': lats, 'point': points})
# manipulate dataframe geometry
gdf = mydf.drop(['longitude', 'latitude'], axis=1)
gdf = gpd.GeoDataFrame(gdf, crs="EPSG:4326", geometry=gdf.point)
fig, ax = plt.subplots(figsize=(6,7), subplot_kw={'projection': ccrs.PlateCarree()})
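# note: gpd.datasets was deprecated in geopandas 0.14 and removed in 1.0;
# on newer versions, read a downloaded Natural Earth file from a local path instead
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))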
parki = world[(world.name == "Pakistan")] #take a country of interest
# grab the geometry of Pakistan
# can .simplify() it if need be
pg = parki['geometry']
newcol = []
for index, row in gdf.iterrows(): # Looping over all points
    res = pg.contains(row.geometry).values[0]
    newcol.append(res)
# add a new column ('insideQ') to the geodataframe
gdf['insideQ'] = newcol
# add a new column ('color') to the geodataframe
gdf.loc[:, 'color'] = 'green' #set color='green'
# this set color='red' to selected rows
gdf.loc[gdf['insideQ']==False, 'color'] = 'red'
# plot Pakistan
ax.add_geometries(parki['geometry'], crs=ccrs.PlateCarree(), color='lightpink', label='Pakistan')
# plot all points features of `gdf`
gdf.plot(ax=ax, zorder=20, color=gdf.color)
ax.set_extent([60, 80, 22, 39]) #zoomed-in to Pakistan
LegendElement = [
    mpatches.Patch(color='lightpink', label='Pakistan')
]
ax.legend(handles = LegendElement, loc='best')
plt.show()
Output plot:
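As a side note, the row-by-row loop is not strictly needed to select the out-of-bounds rows. The sketch below shows one possible vectorized variant using the `within()` method of a GeoDataFrame, which tests every point against a single polygon in one call. The rectangle `country`, the `stations` frame, and the `station_id` column are hypothetical stand-ins so that the example runs without any external data; with the real data, `country` would be the Pakistan polygon (for example `pg.iloc[0]`) and `stations` would be `ours_NA_gdf`.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical stand-in for the country outline: a plain rectangle.
# With real data this would be the single Pakistan polygon.
country = box(60, 22, 80, 39)

# Hypothetical polling stations; the last two lie outside the rectangle.
stations = gpd.GeoDataFrame(
    {"station_id": [1, 2, 3, 4]},
    geometry=[Point(70, 30), Point(65, 25), Point(85, 30), Point(70, 45)],
    crs="EPSG:4326",
)

# One vectorized call tests every point against the polygon
# and returns a boolean Series aligned with the rows.
stations["loc_validity"] = stations.within(country)

# Select the rows whose coordinates fall outside the outline.
outside = stations[~stations["loc_validity"]]
print(outside["station_id"].tolist())  # [3, 4]
```

This only removes the Python-level loop. If the polygon itself is very detailed, simplifying it first, as suggested in the comment in the code above, may still be worthwhile.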