Get only the values within a shape with geopandas
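I have a large dataset of geographic data. I only want to extract the data that falls within part of a city, so I need to create a shape and check whether my points are inside it.
I tried the following:
Points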
1)
turin.head() =
latitude longitude
0 44.9125 7.7432
21 45.0764 7.5249
22 45.0764 7.5249
23 45.0755 7.5248
24 45.0718 7.5236
2)
import geopandas as gpd
from shapely.geometry import Point

geometry = [Point(xy) for xy in zip(turin.longitude,turin.latitude)]
turin_point = gpd.GeoDataFrame(turin,crs=crs,geometry=geometry)
turin_point.head()
latitude longitude geometry
0 44.9125 7.7432 POINT (7.74320 44.91250)
21 45.0764 7.5249 POINT (7.52490 45.07640)
22 45.0764 7.5249 POINT (7.52490 45.07640)
23 45.0755 7.5248 POINT (7.52480 45.07550)
24 45.0718 7.5236 POINT (7.52360 45.07180)
Border
1)
border.head()
longitude latitude
0 7.577835 45.041828
1 7.579849 45.039877
2 7.580106 45.039628
3 7.580852 45.038576
4 7.580866 45.038556
2)
geometry2 = [Point(xy) for xy in zip(border.longitude,border.latitude)]
border_point = gpd.GeoDataFrame(border,crs=crs,geometry=geometry2)
border_point.head() =
longitude latitude geometry
0 7.577835 45.041828 POINT (7.57783 45.04183)
1 7.579849 45.039877 POINT (7.57985 45.03988)
2 7.580106 45.039628 POINT (7.58011 45.03963)
3 7.580852 45.038576 POINT (7.58085 45.03858)
4 7.580866 45.038556 POINT (7.58087 45.03856)
Then I ran:
turin_final= border_point.geometry.unary_union
within_turin = turin_point[turin_point.geometry.within(turin_final)]
This fails with:
IndexError: too many indices for array
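If you want to find the points inside the border, the border itself must be a polygon, not another set of points. If the coordinates of the border points are in the right order, you can try replacing the last two lines with these: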
from shapely.geometry import Polygon
turin_final = Polygon([[p.x, p.y] for p in border_point.geometry])
within_turin = turin_point[turin_point.geometry.within(turin_final)]
That ordering is usually not guaranteed, but as I understand from your comment, it solved your problem.
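For anyone who wants to try the idea end to end, here is a minimal, self-contained sketch. The coordinates are made up (a simple square and three test points), the CRS EPSG:4326 is an assumption about longitude/latitude data, and the `is_valid` check is only a rough hint that the vertices may be out of order, since a self-intersecting ring is invalid but not every ordering mistake self-intersects.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical border vertices, listed in order around the boundary.
border = pd.DataFrame({
    "longitude": [7.0, 8.0, 8.0, 7.0],
    "latitude": [45.0, 45.0, 46.0, 46.0],
})

# Hypothetical points: the first two lie inside the square, the third outside.
points = pd.DataFrame({
    "longitude": [7.5, 7.9, 8.5],
    "latitude": [45.5, 45.1, 45.5],
})

# EPSG:4326 (WGS 84 longitude/latitude) is an assumption about the data.
points_gdf = gpd.GeoDataFrame(
    points,
    geometry=gpd.points_from_xy(points.longitude, points.latitude),
    crs="EPSG:4326",
)

# Build a single polygon from the ordered vertices (x = longitude, y = latitude).
boundary = Polygon(list(zip(border.longitude, border.latitude)))

# A self-intersecting ring often means the vertices are out of order.
if not boundary.is_valid:
    raise ValueError("border vertices do not form a valid polygon")

# Keep only the rows whose point lies inside the polygon.
inside = points_gdf[points_gdf.geometry.within(boundary)]
print(inside)
```

With real data, `border` and `points` would be the `border` and `turin` DataFrames from the question.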