How do I filter a GeoJson file for specific countries?

Question

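Challenge: I am trying to create a new dictionary from my GeoJSON dictionary that contains only the countries of interest, because the original GeoJSON file is too large to visualize.

I have a GeoJSON file of the following form, and I created an empty dictionary that mirrors its structure: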

newData = {'features': {},
           'properties':{'ADMIN':"",
                         'ISO_A3':"",
                         },
           'geometry':{'type':"",
                       'coordinates':""
                       },
           'id':""
           }

Here is an example of one of the elements in the GeoJSON file:

data['features'][3]

{'type': 'Feature',
 'properties': {'ADMIN': 'Aruba', 'ISO_A3': 'ABW'},
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-69.99693762899992, 12.577582098000036],
    [-69.93639075399994, 12.53172435100005],
    [-69.92467200399994, 12.519232489000046],
    [-69.91576087099992, 12.497015692000076],
    [-69.88019771999984, 12.453558661000045],
    [-69.87682044199994, 12.427394924000097],
    [-69.88809160099993, 12.417669989000046],
    [-69.90880286399994, 12.417792059000107],
    [-69.93053137899989, 12.425970770000035],
    [-69.94513912699992, 12.44037506700009],
    [-69.92467200399994, 12.44037506700009],
    [-69.92467200399994, 12.447211005000014],
    [-69.95856686099992, 12.463202216000099],
    [-70.02765865799992, 12.522935289000088],
    [-70.04808508999989, 12.53115469000008],
    [-70.05809485599988, 12.537176825000088],
    [-70.06240800699987, 12.546820380000057],
    [-70.06037350199995, 12.556952216000113],
    [-70.0510961579999, 12.574042059000064],
    [-70.04873613199993, 12.583726304000024],
    [-70.05264238199993, 12.600002346000053],
    [-70.05964107999992, 12.614243882000054],
    [-70.06110592399997, 12.625392971000068],
    [-70.04873613199993, 12.632147528000104],
    [-70.00715084499987, 12.5855166690001],
    [-69.99693762899992, 12.577582098000036]]]},
 'id': 'ABW'}

I also have a dataframe object holding the countries I actually want to analyze:

df_Country.head()
2                   Italy
3                   Spain
4                Portugal
5    United Arab Emirates
6                   Egypt

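Some countries in this file are unnecessary for the analysis I am doing, so I would like to filter them out. I believe this is similar to filtering a nested dictionary. To do so, I created an empty dictionary and looped over the features, copying the values from the GeoJSON data whenever there was a match with df_Country. Here is my attempt: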

for i in range(len(data['features'])):
  if data['features'][i]['properties']['ADMIN'] in df_Country:
    newData['properties']['ADMIN'] = data['features'][i]['properties']['ADMIN']
    newData['properties']['ISO_A3'] = data['features'][i]['properties']['ISO_A3']
    newData['geometry']['type'] = data['features'][i]['geometry']['type']
    newData['geometry']['coordinates'] = data['features'][i]['geometry']['coordinates']
    newData['id'] = data['features'][i]['id']

In the end, my newData dictionary is still empty. Any ideas? Thanks in advance!

Answer 1

You were really close! You can do this with a one-line list comprehension:

# example data
geo_json = [
    {'type': 'Feature',
     'properties': {'ADMIN': 'Italy', 'ISO_A3': 'ABW'},
     'geometry': {'type': 'Polygon',
                  'coordinates': [[[-69.99693762899992, 12.577582098000036],
                                   [-69.99693762899992, 12.577582098000036]]]},
        'id': 'ABW'},
    {'type': 'Feature',
     'properties': {'ADMIN': 'Aruba', 'ISO_A3': 'ABW'},
     'geometry': {'type': 'Polygon',
                  'coordinates': [[[-69.99693762899992, 12.577582098000036],
                                   [-69.99693762899992, 12.577582098000036]]]},
        'id': 'ABW'},
    {'type': 'Feature',
     'properties': {'ADMIN': 'Spain', 'ISO_A3': 'ABW'},
     'geometry': {'type': 'Polygon',
                  'coordinates': [[[-69.99693762899992, 12.577582098000036],
                                   [-69.99693762899992, 12.577582098000036]]]},
        'id': 'ABW'},
]

# countries you want
countries = ['Italy', 'Spain']

# new list of geo_json but only ones with ['properties']['ADMIN'] in countries
filtered = [geo for geo in geo_json if geo['properties']['ADMIN'] in countries]

# pretty print the results
from pprint import pprint
pprint(filtered)

The equivalent for loop for that comprehension looks like:

filtered = []
for geo in geo_json:
    if geo['properties']['ADMIN'] in countries:
        filtered.append(geo)

Output (only Spain and Italy, out of the 3 entries in geo_json):

[{'geometry': {'coordinates': [[[-69.99693762899992, 12.577582098000036],  
                                [-69.99693762899992, 12.577582098000036]]],
               'type': 'Polygon'},
  'id': 'ABW',
  'properties': {'ADMIN': 'Italy', 'ISO_A3': 'ABW'},
  'type': 'Feature'},
 {'geometry': {'coordinates': [[[-69.99693762899992, 12.577582098000036],
                                [-69.99693762899992, 12.577582098000036]]],
               'type': 'Polygon'},
  'id': 'ABW',
  'properties': {'ADMIN': 'Spain', 'ISO_A3': 'ABW'},
  'type': 'Feature'}]
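
To apply the same idea to the dictionary from the question, here is a minimal sketch. It rests on a few assumptions that are not confirmed by the question: that data is a dict whose 'features' key holds the list of features, that it carries a top-level "type": "FeatureCollection" entry, and that df_Country is a pandas Series of country names. The helper name filter_features is made up for illustration. Two things in the original attempt are worth noting: the loop writes into the same newData keys on every iteration instead of collecting features, and the in operator on a pandas Series tests the index labels rather than the values, so the names have to be pulled out first (for example with df_Country.tolist()).

```python
import json

def filter_features(data, wanted_names):
    """Return a copy of a FeatureCollection dict keeping only the wanted countries."""
    wanted = set(wanted_names)  # plain values, not index labels; also fast lookups
    kept = [f for f in data["features"] if f["properties"]["ADMIN"] in wanted]
    # keep every other top-level key (such as "type") and swap in the filtered list
    return {**data, "features": kept}

# Stand-in for the real file: the same ring is reused for every feature,
# as in the example data above
ring = [[[-69.99693762899992, 12.577582098000036],
         [-69.99693762899992, 12.577582098000036]]]
data = {
    "type": "FeatureCollection",  # assumed top-level key
    "features": [
        {"type": "Feature",
         "properties": {"ADMIN": name, "ISO_A3": "ABW"},
         "geometry": {"type": "Polygon", "coordinates": ring},
         "id": "ABW"}
        for name in ["Italy", "Aruba", "Spain"]
    ],
}

# With a pandas Series this would be: filter_features(data, df_Country.tolist())
new_data = filter_features(data, ["Italy", "Spain"])
kept_names = [f["properties"]["ADMIN"] for f in new_data["features"]]
print(kept_names)

# Serialize the smaller collection, e.g. to write it to a file
geojson_text = json.dumps(new_data)
```

The original data is left untouched, so the full file can be filtered again later with a different list of countries.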
