Use id from feature as geopandas dataframe index

Question

I want to convert a GeoJSON file into a GeoDataFrame while moving each feature's "id" into the index. Example code:

import geopandas as gpd
import json

data = {"type": "FeatureCollection", "features": [
  {"geometry": {"coordinates": [[-1, -1], [0, 1], [1, 1], [-1, -1]], "type": "LineString"}, "id": 123, "properties": {"building": "house"}, "type": "Feature"},
  {"geometry": {"coordinates": [[-2, -2], [0, 2], [2, 2], [-2, -2]], "type": "LineString"}, "id": 456, "properties": {"building": "apartments"}, "type": "Feature"}
]}

with open('/tmp/foo.json', 'w') as f: json.dump(data, f)

gdf = gpd.read_file('/tmp/foo.json')
gdf

The problem is that the id is simply dropped, and an auto-incrementing RangeIndex is used instead:

     building                                           geometry
0       house  LINESTRING (-1.00000 -1.00000, 0.00000 1.00000...
1  apartments  LINESTRING (-2.00000 -2.00000, 0.00000 2.00000...

Could you advise how to handle this elegantly? Should I collect the ids and set the index manually, like this:

gdf.index = [x['id'] for x in data['features']]

Answer 1

You can use pandas json_normalize() to extract the id from the GeoJSON features, and set_index() to make it the index.

import geopandas as gpd
import json
import pandas as pd

data = {"type": "FeatureCollection", "features": [
  {"geometry": {"coordinates": [[-1, -1], [0, 1], [1, 1], [-1, -1]], "type": "LineString"}, "id": 123, "properties": {"building": "house"}, "type": "Feature"},
  {"geometry": {"coordinates": [[-2, -2], [0, 2], [2, 2], [-2, -2]], "type": "LineString"}, "id": 456, "properties": {"building": "apartments"}, "type": "Feature"}
]}

gpd.GeoDataFrame.from_features(data).set_index(pd.json_normalize(data["features"])["id"].values)

                                geometry    building
123  LINESTRING (-1 -1, 0 1, 1 1, -1 -1)       house
456  LINESTRING (-2 -2, 0 2, 2 2, -2 -2)  apartments
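
For comparison, here is a minimal sketch of the manual approach suggested in the question, which skips json_normalize() and reads the ids straight from the feature dicts. The helper name geodataframe_with_feature_ids is made up for illustration and is not part of geopandas. The sketch assumes the GeoJSON is already loaded as a dict; features without an "id" member would get None as their index label.

```python
import geopandas as gpd


def geodataframe_with_feature_ids(feature_collection):
    """Build a GeoDataFrame whose index holds each feature's top-level "id".

    Hypothetical helper for illustration; not a geopandas API.
    """
    features = feature_collection["features"]
    # from_features() keeps geometry and properties but not the feature id.
    gdf = gpd.GeoDataFrame.from_features(features)
    # Rows come back in feature order, so the ids line up positionally.
    gdf.index = [feature.get("id") for feature in features]
    gdf.index.name = "id"
    return gdf


data = {"type": "FeatureCollection", "features": [
    {"geometry": {"coordinates": [[-1, -1], [0, 1], [1, 1], [-1, -1]], "type": "LineString"},
     "id": 123, "properties": {"building": "house"}, "type": "Feature"},
    {"geometry": {"coordinates": [[-2, -2], [0, 2], [2, 2], [-2, -2]], "type": "LineString"},
     "id": 456, "properties": {"building": "apartments"}, "type": "Feature"},
]}

gdf = geodataframe_with_feature_ids(data)
```

If the data lives in a file, loading it with json.load() first and passing the resulting dict to the helper should work the same way.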
