Python shapely: Aggregating points to shape files for a Choropleth map
I am trying to create a choropleth in Python 3, using shapely and fiona, with Bokeh for display.
I have a file with roughly 7000 rows containing the locations of towns and their counts.
Example:
54.7604;9.55827;208
54.4004;9.95918;207
53.8434;9.95271;203
53.5979;10.0013;201
53.728;10.2526;197
53.646;10.0403;196
54.3977;10.1054;193
52.4385;9.39217;193
53.815;10.3476;192
...
I want to display these on a 12.5 km grid, for which a shapefile is available at
https://opendata-esri-de.opendata.arcgis.com/datasets/3c1f46241cbb4b669e18b002e4893711_0
My code works, but it is very slow: it is a brute-force algorithm that checks each of the 7127 grid shapes against all 7000 points.
import pandas as pd
import fiona
from shapely.geometry import Polygon, Point, MultiPoint, MultiPolygon
from shapely.prepared import prep

sf = r'c:\Temp\geo_de\Hexagone_125_km\Hexagone_125_km.shp'
shp = fiona.open(sf)
district_xy = [[xy for xy in feat["geometry"]["coordinates"][0]] for feat in shp]
district_poly = [Polygon(xy) for xy in district_xy]  # coords to Polygon

df_p = pd.read_csv('points_file.csv', sep=';', header=None)
df_p.columns = ('lat', 'lon', 'count')

map_points = [Point(x, y) for x, y in zip(df_p.lon, df_p.lat)]  # convert to shapely Points
all_points = MultiPoint(map_points)  # all points as a MultiPoint

def calc_points_per_poly(poly, points, values):  # returns the total count for one polygon
    poly = prep(poly)  # prepared geometry speeds up repeated contains() checks
    return sum(v for p, v in zip(points, values) if poly.contains(p))

# this is the slow part:
# for each shape this sums up the points
sum_hex = [calc_points_per_poly(x, all_points, df_p['count']) for x in district_poly]
Since this is really slow, I am wondering whether there is a faster way to get the sum_hex values, especially since the real-world list of points could be much larger, and a finer grid with more shapes would give better results.
I would suggest using geopandas and its built-in rtree spatial index. It lets you run the expensive point-in-polygon check only for points that have a chance of lying inside the polygon.
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

sf = 'Hexagone_125_km.shp'
shp = gpd.read_file(sf)

df_p = pd.read_csv('points_file.csv', sep=';', header=None)
df_p.columns = ('lat', 'lon', 'count')
gdf_p = gpd.GeoDataFrame(df_p, geometry=[Point(x, y) for x, y in zip(df_p.lon, df_p.lat)])

sum_hex = []
spatial_index = gdf_p.sindex  # rtree spatial index over the points
for index, row in shp.iterrows():
    polygon = row.geometry
    # cheap pre-filter: points whose bounding boxes intersect the polygon's bounds
    possible_matches_index = list(spatial_index.intersection(polygon.bounds))
    possible_matches = gdf_p.iloc[possible_matches_index]
    # exact point-in-polygon test only on the candidates
    precise_matches = possible_matches[possible_matches.within(polygon)]
    sum_hex.append(precise_matches['count'].sum())
shp['sum'] = sum_hex
This solution should be faster than yours. You can then plot the GeoDataFrame with Bokeh. If you want more details on spatial indexing, I recommend this article by Geoff Boeing: https://geoffboeing.com/2016/10/r-tree-spatial-index-python/
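For the plotting step, here is a minimal sketch (not part of the original answer) of one way to feed the aggregated GeoDataFrame to Bokeh via GeoJSONDataSource; the 'sum' column comes from the code above, while the palette and figure options are just illustrative assumptions, and the polygons are drawn in whatever CRS the shapefile uses.

from bokeh.io import show
from bokeh.models import ColorBar, GeoJSONDataSource, LinearColorMapper
from bokeh.palettes import Viridis256
from bokeh.plotting import figure

# GeoJSONDataSource converts the polygon geometries into the 'xs'/'ys'
# columns that figure.patches() expects
geo_source = GeoJSONDataSource(geojson=shp.to_json())

color_mapper = LinearColorMapper(palette=Viridis256,
                                 low=float(shp['sum'].min()),
                                 high=float(shp['sum'].max()))

p = figure(title='Counts per 12.5 km hexagon', match_aspect=True)
p.patches('xs', 'ys', source=geo_source,
          fill_color={'field': 'sum', 'transform': color_mapper},
          line_color='grey', line_width=0.3)
p.add_layout(ColorBar(color_mapper=color_mapper), 'right')
show(p)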