Getting all longitudes and latitudes from a pandas DataFrame via geopy
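I have a DataFrame with location data, for example "Los Angeles, CA".
The goal is to iterate over every entry in the column and save the longitude and latitude in new columns. Any input or hints are very welcome!
I tried it with a single value, and it worked.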
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="xxx")
location=geolocator.geocode(df['Location'][1])
print(location.longitude)
print(location.latitude)
-117.8704931
33.7500378
Now, as a beginner, I thought: let's do a for loop:
df['lat']=0
print(df['Location'][1])
for x in range(1,len(df)+1):
    location = geolocator.geocode(df['Location'][x])
    df['loc'][x]=location.latitude
I get the following warning:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
df['loc'][x]=location.latitude
After about 2 minutes, the following error appears:
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\skpok\Anaconda3\lib\site-packages\geopy\geocoders\base.py", line 344, in _call_geocoder
    page = requester(req, timeout=timeout, **kwargs)
  File "C:\Users\skpok\Anaconda3\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "C:\Users\skpok\Anaconda3\lib\urllib\request.py", line 543, in _open
    '_open', req)
  File "C:\Users\skpok\Anaconda3\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Users\skpok\Anaconda3\lib\urllib\request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Users\skpok\Anaconda3\lib\urllib\request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error timed out>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\skpok\Downloads\test123.py", line 12, in <module>
    location = geolocator.geocode(df['Location'][x])
  File "C:\Users\skpok\Anaconda3\lib\site-packages\geopy\geocoders\osm.py", line 309, in geocode
    self._call_geocoder(url, timeout=timeout), exactly_one
  File "C:\Users\skpok\Anaconda3\lib\site-packages\geopy\geocoders\base.py", line 367, in _call_geocoder
    raise GeocoderTimedOut('Service timed out')
geopy.exc.GeocoderTimedOut: Service timed out
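Nominatim has a maximum request rate (1 request per second). You should add a pause to your script so that it sleeps between loop iterations. Writing the value with df.at[x, 'lat'] instead of the chained df['loc'][x] also avoids the SettingWithCopyWarning.
You can try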
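import time

df['lat'] = float('nan')  # float column, so latitudes can be stored without a dtype change
print(df['Location'][1])
for x in df.index:  # use the actual index labels instead of assuming 1..len(df)
    location = geolocator.geocode(df.at[x, 'Location'])
    time.sleep(2)  # stay under Nominatim's limit of 1 request per second
    if location is not None:  # geocode() returns None when nothing is found
        df.at[x, 'lat'] = location.latitude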
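Since the question asks for both longitude and latitude, and a single timeout currently aborts the whole run, here is a minimal sketch of the same idea with a retry added. The helper name geocode_all and its parameters (delay_seconds, max_retries, sleep) are made up for illustration and are not part of geopy or pandas. It only assumes that the geocode callable returns an object with .latitude and .longitude, or None when nothing is found, as geolocator.geocode does in the question.

```python
import time


def geocode_all(queries, geocode, delay_seconds=1.0, max_retries=3, sleep=time.sleep):
    """Geocode each query in turn, pausing between requests and retrying on errors.

    `geocode` is any callable that takes a query string and returns an object
    with `.latitude` and `.longitude`, or None when nothing is found.
    Returns a list of (latitude, longitude) tuples, one per query.
    """
    results = []
    for query in queries:
        location = None
        for attempt in range(max_retries):
            try:
                location = geocode(query)
                break  # the request went through, even if it found nothing
            except Exception:  # e.g. a timeout; narrow this to specific errors in real code
                sleep(delay_seconds * (attempt + 1))  # wait a bit longer after each failure
        if location is None:
            results.append((None, None))  # not found, or every attempt failed
        else:
            results.append((location.latitude, location.longitude))
        sleep(delay_seconds)  # keep to at most one request per second
    return results
```

With the objects from the question, this could be called as coords = geocode_all(df['Location'], geolocator.geocode), and the two new columns filled with df['lat'] = [c[0] for c in coords] and df['long'] = [c[1] for c in coords]. geopy also ships a RateLimiter class in geopy.extra.rate_limiter that wraps a geocode function with a minimum delay and retries, which may be preferable to a hand-written loop.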