Use id from feature as geopandas dataframe index
What I'd like to achieve is to convert a geojson file into a GeoDataFrame while moving the feature "id" into the index. Example code:
import geopandas as gpd
import json
data = {"type": "FeatureCollection", "features": [
{"geometry": {"coordinates": [[-1, -1], [0, 1], [1, 1], [-1, -1]], "type": "LineString"}, "id": 123, "properties": {"building": "house"}, "type": "Feature"},
{"geometry": {"coordinates": [[-2, -2], [0, 2], [2, 2], [-2, -2]], "type": "LineString"}, "id": 456, "properties": {"building": "apartments"}, "type": "Feature"}
]}
with open('/tmp/foo.json', 'w') as f: json.dump(data, f)
gpd.read_file('/tmp/foo.json')
The problem is that the id is simply dropped and an auto-incrementing RangeIndex is used instead:
building geometry
0 house LINESTRING (-1.00000 -1.00000, 0.00000 1.00000...
1 apartments LINESTRING (-2.00000 -2.00000, 0.00000 2.00000...
Could you advise how to handle this in an elegant way? Should I collect the ids and set the index manually, like this:
gdf.index = [x['id'] for x in data['features']]
You can use pandas json_normalize() to extract the ids from the geojson and set_index() to set them as the index.
import geopandas as gpd
import json
import pandas as pd
data = {"type": "FeatureCollection", "features": [
{"geometry": {"coordinates": [[-1, -1], [0, 1], [1, 1], [-1, -1]], "type": "LineString"}, "id": 123, "properties": {"building": "house"}, "type": "Feature"},
{"geometry": {"coordinates": [[-2, -2], [0, 2], [2, 2], [-2, -2]], "type": "LineString"}, "id": 456, "properties": {"building": "apartments"}, "type": "Feature"}
]}
gpd.GeoDataFrame.from_features(data).set_index(pd.json_normalize(data["features"])["id"].values)
 | geometry | building |
---|---|---|
123 | LINESTRING (-1 -1, 0 1, 1 1, -1 -1) | house |
456 | LINESTRING (-2 -2, 0 2, 2 2, -2 -2) | apartments |
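For comparison, here is a minimal sketch of the manual approach from the question, written with the standard library only. The helper name feature_ids is hypothetical (not part of geopandas or pandas), and the check for a missing "id" is an added assumption: GeoJSON treats a feature's "id" as optional, so a feature without one would otherwise shift the index out of step with the rows.

```python
def feature_ids(collection):
    """Return the top-level "id" of every feature, in input order.

    Hypothetical helper for illustration; not a geopandas or pandas API.
    """
    ids = []
    for position, feature in enumerate(collection["features"]):
        if "id" not in feature:
            # "id" is optional in GeoJSON, so fail loudly instead of
            # producing an index that no longer lines up with the rows.
            raise KeyError(f"feature at position {position} has no 'id'")
        ids.append(feature["id"])
    return ids


data = {"type": "FeatureCollection", "features": [
    {"geometry": {"coordinates": [[-1, -1], [0, 1], [1, 1], [-1, -1]], "type": "LineString"},
     "id": 123, "properties": {"building": "house"}, "type": "Feature"},
    {"geometry": {"coordinates": [[-2, -2], [0, 2], [2, 2], [-2, -2]], "type": "LineString"},
     "id": 456, "properties": {"building": "apartments"}, "type": "Feature"},
]}

ids = feature_ids(data)
print(ids)  # [123, 456]

# With geopandas installed, the list could then be assigned as in the
# question (assuming the rows keep the order of data["features"]):
#   gdf = gpd.GeoDataFrame.from_features(data)
#   gdf.index = ids
```

This gives the same index as the json_normalize() call above for this data; the difference is only that a feature without an "id" raises an error up front.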