How do I filter a GeoJSON file for specific countries?
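Challenge: I am trying to build a new dictionary from my GeoJSON dictionary that keeps only the countries I am interested in, because the original GeoJSON file is too large to visualize.
I have a GeoJSON file of the following form, and I created an empty dictionary to mirror it: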
newData = {'features': {},
           'properties': {'ADMIN': "",
                          'ISO_A3': "",
                          },
           'geometry': {'type': "",
                        'coordinates': ""
                        },
           'id': ""
           }
Here is an example of one of the elements in the GeoJSON file:
data['features'][3]
{'type': 'Feature',
'properties': {'ADMIN': 'Aruba', 'ISO_A3': 'ABW'},
'geometry': {'type': 'Polygon',
'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.93639075399994, 12.53172435100005],
[-69.92467200399994, 12.519232489000046],
[-69.91576087099992, 12.497015692000076],
[-69.88019771999984, 12.453558661000045],
[-69.87682044199994, 12.427394924000097],
[-69.88809160099993, 12.417669989000046],
[-69.90880286399994, 12.417792059000107],
[-69.93053137899989, 12.425970770000035],
[-69.94513912699992, 12.44037506700009],
[-69.92467200399994, 12.44037506700009],
[-69.92467200399994, 12.447211005000014],
[-69.95856686099992, 12.463202216000099],
[-70.02765865799992, 12.522935289000088],
[-70.04808508999989, 12.53115469000008],
[-70.05809485599988, 12.537176825000088],
[-70.06240800699987, 12.546820380000057],
[-70.06037350199995, 12.556952216000113],
[-70.0510961579999, 12.574042059000064],
[-70.04873613199993, 12.583726304000024],
[-70.05264238199993, 12.600002346000053],
[-70.05964107999992, 12.614243882000054],
[-70.06110592399997, 12.625392971000068],
[-70.04873613199993, 12.632147528000104],
[-70.00715084499987, 12.5855166690001],
[-69.99693762899992, 12.577582098000036]]]},
'id': 'ABW'}
I also have a DataFrame object holding the countries I actually want to analyze:
df_Country.head()
2 Italy
3 Spain
4 Portugal
5 United Arab Emirates
6 Egypt
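The file contains countries that are unnecessary for the analysis I am doing, so I want to filter them out. I believe this is similar to filtering a nested dictionary. To do that, I created an empty dictionary and looped over the features, adding the GeoJSON values whenever there was a match with df_Country. Here is my attempt: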
for i in range(len(data['features'])):
    if data['features'][i]['properties']['ADMIN'] in df_Country:
        newData['properties']['ADMIN'] = data['features'][i]['properties']['ADMIN']
        newData['properties']['ISO_A3'] = data['features'][i]['properties']['ISO_A3']
        newData['geometry']['type'] = data['features'][i]['geometry']['type']
        newData['geometry']['coordinates'] = data['features'][i]['geometry']['coordinates']
        newData['id'] = data['features'][i]['id']
In the end, my newData dictionary is still empty. Any ideas? Thanks in advance!
You're really close! You can do this with a one-line list comprehension:
# example data
geo_json = [
{'type': 'Feature',
'properties': {'ADMIN': 'Italy', 'ISO_A3': 'ABW'},
'geometry': {'type': 'Polygon',
'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.99693762899992, 12.577582098000036]]]},
'id': 'ABW'},
{'type': 'Feature',
'properties': {'ADMIN': 'Aruba', 'ISO_A3': 'ABW'},
'geometry': {'type': 'Polygon',
'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.99693762899992, 12.577582098000036]]]},
'id': 'ABW'},
{'type': 'Feature',
'properties': {'ADMIN': 'Spain', 'ISO_A3': 'ABW'},
'geometry': {'type': 'Polygon',
'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.99693762899992, 12.577582098000036]]]},
'id': 'ABW'},
]
# countries you want
countries = ['Italy', 'Spain']
# new list of geo_json but only ones with ['properties']['ADMIN'] in countries
filtered = [geo for geo in geo_json if geo['properties']['ADMIN'] in countries]
# pretty print the results
from pprint import pprint
pprint(filtered)
The equivalent for loop for that comprehension looks like:
filtered = []
for geo in geo_json:
    if geo['properties']['ADMIN'] in countries:
        filtered.append(geo)
Output (only Spain and Italy, out of the 3 entries in geo_json):
[{'geometry': {'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.99693762899992, 12.577582098000036]]],
'type': 'Polygon'},
'id': 'ABW',
'properties': {'ADMIN': 'Italy', 'ISO_A3': 'ABW'},
'type': 'Feature'},
{'geometry': {'coordinates': [[[-69.99693762899992, 12.577582098000036],
[-69.99693762899992, 12.577582098000036]]],
'type': 'Polygon'},
'id': 'ABW',
'properties': {'ADMIN': 'Spain', 'ISO_A3': 'ABW'},
'type': 'Feature'}]
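As for why the original loop left newData untouched: if df_Country is a pandas Series or DataFrame, the `in` operator checks the index labels (Series) or column labels (DataFrame), not the stored values, so a test like 'Italy' in df_Country is most likely never true. Even if it had matched, each iteration would overwrite the same keys, leaving only the last matching country. Below is a minimal sketch of applying the filter to the whole file and wrapping the result back into a FeatureCollection. The helper name filter_features, the tiny stand-in data, and the plain Python list of country names are assumptions made for illustration; with a DataFrame you would first pull the names out of whichever column holds them.

```python
import json


def filter_features(collection, wanted_names, key="ADMIN"):
    """Return a new FeatureCollection with only the wanted features.

    Hypothetical helper: keeps a feature when properties[key] is one of
    wanted_names. The input collection is not modified.
    """
    wanted = set(wanted_names)  # a set makes each membership test cheap
    kept = [
        feature
        for feature in collection["features"]
        if feature["properties"].get(key) in wanted
    ]
    return {"type": "FeatureCollection", "features": kept}


# Tiny stand-in for the real file (made-up point geometries, illustration only)
data = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"ADMIN": "Italy", "ISO_A3": "ITA"},
         "geometry": {"type": "Point", "coordinates": [12.5, 41.9]},
         "id": "ITA"},
        {"type": "Feature",
         "properties": {"ADMIN": "Aruba", "ISO_A3": "ABW"},
         "geometry": {"type": "Point", "coordinates": [-69.97, 12.52]},
         "id": "ABW"},
        {"type": "Feature",
         "properties": {"ADMIN": "Spain", "ISO_A3": "ESP"},
         "geometry": {"type": "Point", "coordinates": [-3.7, 40.4]},
         "id": "ESP"},
    ],
}

# Assumption: the country names have already been extracted from df_Country
# into a plain list of strings.
countries = ["Italy", "Spain"]

new_data = filter_features(data, countries)
print([f["properties"]["ADMIN"] for f in new_data["features"]])

# Serialize the smaller collection, e.g. to hand it to a plotting library
geojson_text = json.dumps(new_data)
```

Keeping the top-level 'type' and 'features' keys means the filtered result has the same shape as the original data, so it should be usable wherever the full file was expected.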