Python: get district and state columns for each lat/long coordinate with Geopy
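I have a list of more than 200 latitude/longitude coordinate pairs.
For each pair, I want to build a DataFrame with a district column and a state column, so the DataFrame will have 3 columns: cord, district and state.
I am using the geopy library for this, but I cannot get records for more than 115 coordinates.
Sample data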
cord
0 (19.4, 17.93)
1 (55.54, 93.93)
2 (52.45, 78.93)
3 (65.54, 67.93)
4 (47.74, 99.93)
Desired output
cord district state
0 (19.4, 17.93) xyz aaa
1 (55.54, 93.93) adc aaa
2 (52.45, 78.93) gyu drt
3 (65.54, 67.93) www bhn
4 (47.74, 99.93) ccf bvg
I tried this code, but it cannot fetch details for more than 115 queries.
from geopy.geocoders import Nominatim

district = {}  # Initialize empty dict
# List containing all the coordinates in this format (lat, long);
# the sample pairs stand in for the full list of 200+ pairs
geo_loc = [(19.4, 17.93), (55.54, 93.93)]
for cord in geo_loc:
    geolocator = Nominatim(user_agent='user_agent')
    location = geolocator.reverse(cord, addressdetails=True)
    district[cord] = location.raw['address']['state_district']
I need to fetch up to 500 unique coordinates at a time.
I also need the district and state names in separate columns.
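The Nominatim Usage Policy asks users to avoid heavy use, with a limit of at most 1 request per second: "No heavy uses (an absolute maximum of 1 request per second)."
You can use geopy's RateLimiter to send 1 request per second.
I have tested that the following code works for more than 115 requests: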
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim
import pandas as pd

geolocator = Nominatim(user_agent="user_agent")
# add rate limit: wait at least 1 second between requests
reverse = RateLimiter(geolocator.reverse, min_delay_seconds=1)

state_list = []  # Initialize empty list

# create dataframe
df = pd.DataFrame({"geo_loc": [(19.4, 17.93), (55.54, 93.93), (52.45, 78.93), (65.54, 67.93), (47.74, 99.93)]})

# get location coordinates
geo_loc = df.geo_loc.values

for cord in geo_loc:
    # send request
    location = reverse(cord, addressdetails=True)
    # get state value (None if no result came back or the key is missing)
    state = location.raw["address"].get("state") if location else None
    # store state value
    state_list.append(state)

# assign back states
df['states'] = state_list
print(df)
Resulting DataFrame:
geo_loc states
0 (19.4, 17.93) Tibesti تيبستي
1 (55.54, 93.93) Красноярский край
2 (52.45, 78.93) Алтайский край
3 (65.54, 67.93) Ямало-Ненецкий автономный округ
4 (47.74, 99.93) Архангай
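The code above fills only a state column, while the question asks for district and state in separate columns. The following is a minimal sketch of one way to get both, not a tested drop-in solution. The helper names `extract_district_state`, `build_rows` and `run_with_nominatim` are hypothetical and not part of geopy. It assumes the district is found under the `state_district` key, as in the question's code, and it uses `.get` so that a missing key gives `None` instead of raising `KeyError`.

```python
from functools import partial
from typing import Callable, Iterable, List, Optional, Tuple

Coord = Tuple[float, float]


def extract_district_state(address: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (district, state) from a Nominatim-style address dict.

    Assumption: the district lives under "state_district", as in the
    question's code. Missing keys yield None rather than a KeyError.
    """
    return address.get("state_district"), address.get("state")


def build_rows(coords: Iterable[Coord], reverse_fn: Callable) -> List[dict]:
    """Reverse-geocode each coordinate and collect one row per coordinate."""
    rows = []
    for cord in coords:
        location = reverse_fn(cord)
        # reverse() may return None when nothing is found (or when the
        # rate limiter swallowed an exception), so guard before using .raw
        address = location.raw.get("address", {}) if location is not None else {}
        district, state = extract_district_state(address)
        rows.append({"cord": cord, "district": district, "state": state})
    return rows


def run_with_nominatim(coords: Iterable[Coord]):
    """Hypothetical wiring to geopy; sends real network requests when called."""
    import pandas as pd
    from geopy.extra.rate_limiter import RateLimiter
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent="user_agent")
    # At most 1 request per second, per the Nominatim usage policy
    reverse = RateLimiter(
        partial(geolocator.reverse, addressdetails=True),
        min_delay_seconds=1,
    )
    return pd.DataFrame(build_rows(coords, reverse))
```

Calling `run_with_nominatim(geo_loc)` should give a DataFrame with the columns cord, district and state. With a 1-second delay, 500 coordinates take at least a little over 8 minutes. Keeping the lookup function as a parameter of `build_rows` lets the row-building logic be checked with a stub, without contacting the service.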