How do I select objects within a geographic region in a pandas DataFrame?
I am trying to select objects from a pandas DataFrame that contains a list of item IDs with latitude/longitude pairs. Is there a method for this kind of selection?
I think this is similar to this SO question, but using pandas instead of SQL:
Selecting geographical points within area
This is the table I have saved in locations.csv:
ID, LAT, LON
001,35.00,-75.00
002,35.01,-80.00
...
999,25.76,-64.00
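I can load the DataFrame and select a rectangular region:
import pandas as pd
# skipinitialspace=True strips the blanks that follow the commas in the header row
df = pd.read_csv('locations.csv', delimiter=',', skipinitialspace=True)
lat_max = 32.323496
lat_min = 25.712767
lon_max = -72.863358
lon_min = -74.729456
# Combine all four conditions into a single boolean mask
small_df = df[(df['LAT'] > lat_min) & (df['LAT'] < lat_max) & (df['LON'] > lon_min) & (df['LON'] < lon_max)]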
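How would I select objects within an irregular region?
How would I construct the DataFrame selection command?
I could build a lambda function that returns True for a LAT and LON inside the region, but I am not sure how to use it with a pandas DataFrame.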
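The lambda idea mentioned above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper name `point_in_polygon`, the `region` vertex list, and the sample rows are all made up for the example, and the test is a plain ray-casting check that treats longitude and latitude as planar coordinates.

```python
def point_in_polygon(lon, lat, vertices):
    """Return True if (lon, lat) lies inside the polygon described by `vertices`.

    `vertices` is a list of (lon, lat) tuples. Coordinates are treated as planar,
    and points exactly on the boundary may be classified either way.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical irregular region and sample rows of (ID, LAT, LON)
region = [(30, 10), (40, 40), (20, 40), (10, 20)]
rows = [(1, 15.1, 10.0), (2, 15.2, 15.1), (3, 15.3, 20.2),
        (4, 15.4, 25.3), (5, 15.5, 30.4)]

selected_ids = [i for i, lat, lon in rows if point_in_polygon(lon, lat, region)]
print(selected_ids)
```

With a DataFrame, the same predicate can be turned into a boolean mask by applying it row by row, for example `df[df.apply(lambda row: point_in_polygon(row['LON'], row['LAT'], region), axis=1)]`.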
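The working code below selects the points that lie within a region. It starts by creating two GeoDataFrames: the first contains a polygon, and the second contains all the points that will be spatially joined with the first. The spatial join predicate within selects the points that fall inside the polygon. The result of the operation is also a GeoDataFrame, containing only the points that fall within the polygon's area.
Contents of locations.csv: 6 lines, including the column headers.
Note: the header line contains no spaces.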
ID,LAT,LON
1, 15.1, 10.0
2, 15.2, 15.1
3, 15.3, 20.2
4, 15.4, 25.3
5, 15.5, 30.4
Code:
import pandas as pd
import geopandas as gpd
from shapely.wkt import loads
# Create a GeoDataFrame `polygon_df` holding one row with a polygon.
# This polygon will be used to select points from another GeoDataFrame.
d = {'poly_id': [1], 'wkt': ['POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))']}
df = pd.DataFrame(data=d)
geometry = [loads(pgon) for pgon in df.wkt]
polygon_df = gpd.GeoDataFrame(df, crs='EPSG:4326', geometry=geometry)
# One can plot this polygon with the command:
# polygon_df.plot()
# Read the file with pandas
locs = pd.read_csv('locations.csv', sep=',')
# Turn it into a GeoDataFrame named `geo_locs`, building point geometries from LON (x) and LAT (y)
geo_locs = gpd.GeoDataFrame(locs, crs='EPSG:4326',
                            geometry=gpd.points_from_xy(locs.LON, locs.LAT))
# Do a spatial join of the points within the polygon; the result is the GeoDataFrame `pts_in_poly`.
# Note: `predicate=` requires geopandas 0.10 or later; older releases use `op=` instead.
pts_in_poly = gpd.sjoin(geo_locs, polygon_df, predicate='within', how='inner')
# Print the ID of the points that fall within the polygon.
print(pts_in_poly.ID)
# The output will be:
# 2    3
# 3    4
# 4    5
# Name: ID, dtype: int64
# Plot the polygon and all the points.
ax1 = polygon_df.plot(color='lightgray', zorder=1)
geo_locs.plot(ax=ax1, zorder=5, color="red")
Output plot:
In the plot, the points with IDs 3, 4, and 5 fall within the polygon.
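If a full spatial join is more than is needed, a row-wise boolean mask built with shapely alone may be enough. The sketch below is one possible variant rather than part of the answer above: it assumes shapely and pandas are installed, reuses the same polygon, and replaces locations.csv with an in-memory DataFrame holding the same five rows.

```python
import pandas as pd
from shapely.geometry import Point
from shapely.wkt import loads

# Same polygon as in the answer, parsed from WKT (x = longitude, y = latitude)
polygon = loads('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))')

# In-memory stand-in for locations.csv
locs = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5],
    'LAT': [15.1, 15.2, 15.3, 15.4, 15.5],
    'LON': [10.0, 15.1, 20.2, 25.3, 30.4],
})

# Row-wise boolean mask: True where the point lies inside the polygon
mask = locs.apply(lambda row: polygon.contains(Point(row['LON'], row['LAT'])), axis=1)

# Ordinary pandas boolean indexing keeps only the matching rows
pts_in_poly = locs[mask]
print(pts_in_poly['ID'].tolist())
```

Because the mask is computed one row at a time in Python, this is likely to be slower than the spatial join on large tables, but it keeps the result an ordinary pandas DataFrame.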