Bulk process a number of latitude and longitude values from a dataframe and make a new column from those responses
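My DataFrame has two columns, latitude and longitude.
I want to use these two columns to compute a test_url column that holds the country for each row.
To do this, I am using the Nominatim OpenStreetMap API URL.
My imports: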
import pandas as pd
import requests
My check_country function:
def check_country(url):
    r = requests.get(url)
    results = r.json()['address']
    return results['country']
Column calculation:
df['test_url'] = df[['latitude','longitude']].apply(lambda x : check_country(f"https://nominatim.openstreetmap.org/reverse?lat={x[0]}&lon={x[1]}&format=json"),axis=1)
But I get a connection error.
Error:
ConnectionError:
HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /reverse?lat=10.75161&lon=77.11299&format=json
(Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000002262FC94C40>:
Failed to establish a new connection:
[WinError 10061] No connection could be made because the target machine actively refused it'
))
You can use GeoPandas with the "World Administrative Boundaries" dataset to do the lookups locally instead of sending one HTTP request per row. The first step is to download the GeoJSON file and install geopandas.
Then:
# Python env: pip install geopandas
# Anaconda env: conda install geopandas
import geopandas as gpd
from shapely.geometry import Point
gdf = gpd.read_file('world-administrative-boundaries.geojson')
p = Point(77.11299, 10.75161)
out = gdf.loc[gdf.intersects(p), 'name']
print(out)
# Output:
226 India
Name: name, dtype: object
Advanced usage: multiple coordinates:
coords = [(40.730610, -73.935242), (10.75161, 77.11299)]
points = [Point(lon, lat) for lat, lon in coords]
dfp = gpd.GeoDataFrame({'geometry': points}, crs=gdf.crs)
out = gpd.sjoin(dfp, gdf, predicate='within')
print(out)
# Output
geometry index_right french_short iso3 status iso_3166_1_alpha_2_codes name region color_code continent
0 POINT (-73.93524 40.73061) 182 États-Unis d'Amérique USA Member State US United States of America Northern America USA Americas
1 POINT (77.11299 10.75161) 226 Inde IND Member State IN India Southern Asia IND Asia
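To tie this back to the DataFrame in the question, the same spatial join can be wrapped in a small helper that adds the country as a new column. This is only a sketch: add_country_column and the output column name country are made up for illustration, and it assumes the DataFrame has a unique index, has no existing country column, and stores its coordinates in columns named latitude and longitude as in the question.

```python
import geopandas as gpd
import pandas as pd


def add_country_column(df, boundaries, lat_col="latitude", lon_col="longitude",
                       name_col="name"):
    """Return a copy of df with a 'country' column looked up from boundaries.

    Hypothetical helper: assumes df has a unique index and no 'country' column.
    """
    # Build point geometries; points_from_xy expects x=longitude, y=latitude.
    points = gpd.GeoDataFrame(
        df.copy(),
        geometry=gpd.points_from_xy(df[lon_col], df[lat_col]),
        crs=boundaries.crs,
    )
    # Keep only the name and geometry columns, renaming the name to avoid clashes.
    polys = boundaries[[name_col, boundaries.geometry.name]].rename(
        columns={name_col: "country"}
    )
    # A left join keeps rows that fall outside every polygon (country is NaN).
    joined = gpd.sjoin(points, polys, how="left", predicate="within")
    # A point can match more than one polygon; keep the first match per row.
    joined = joined[~joined.index.duplicated(keep="first")]
    out = df.copy()
    out["country"] = joined["country"].reindex(df.index)
    return out
```

With the file from above this would be called as df = add_country_column(df, gdf). Points that fall in no polygon (for example in the open sea) end up with NaN rather than raising an error.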