How Do I Prevent This Rate Limiter Error on Geopy?
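I have a DataFrame full of UK postcodes. There are about 400 rows, and I want to geocode the postcodes so I can plot them later. I followed the guide below, so I'm not sure what is causing the error: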
https://practicaldatascience.co.uk/data-science/how-to-geocode-and-map-addresses-in-geopy
I ended up with the following code. The DataFrame I'm using has a single column containing UK postcodes from a dummy dataset.
import pandas as pd
import folium
import geopy
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
geocoder = RateLimiter(Nominatim(user_agent='Get_Lat_Longs').geocode, min_delay_seconds=1)
df = pd.read_excel('Postcodes.xls', sheet_name='Addresses formatted')
df_copy = df.copy()
df_postcodes = df_copy['Postcode'].to_frame()
df_postcodes['Geocode'] = df_postcodes['Postcode'].apply(geocoder)
However, I get the following error, and I'm not sure how to debug it. Any help would be appreciated.
RateLimiter caught an error, retrying (0/2 tries). Called with (*('N20 0PE',), **{}).
Traceback (most recent call last):
File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 696, in urlopen
self._prepare_proxy(conn)
File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 964, in _prepare_proxy
conn.connect()
File "c:\users\np\env\lib\site-packages\urllib3\connection.py", line 364, in connect
conn = self._connect_tls_proxy(hostname, conn)
File "c:\users\np\env\lib\site-packages\urllib3\connection.py", line 507, in _connect_tls_proxy
ssl_context=ssl_context,
File "c:\users\np\env\lib\site-packages\urllib3\util\ssl_.py", line 453, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
File "c:\users\np\env\lib\site-packages\urllib3\util\ssl_.py", line 495, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock)
File "C:\Program Files\Python37\lib\ssl.py", line 423, in wrap_socket
session=session
File "C:\Program Files\Python37\lib\ssl.py", line 870, in _create
self.do_handshake()
File "C:\Program Files\Python37\lib\ssl.py", line 1139, in do_handshake
self._sslobj.do_handshake()
socket.timeout: _ssl.c:1074: The handshake operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\np\env\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 796, in urlopen
**response_kw
File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 796, in urlopen
**response_kw
File "c:\users\np\env\lib\site-packages\urllib3\connectionpool.py", line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "c:\users\np\env\lib\site-packages\urllib3\util\retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='nominatim.openstreetmap.org', port=443): Max retries exceeded with url: /search?q=N20+0PE&format=json&limit=1 (Caused by ProxyError('Cannot connect to proxy.', timeout('_ssl.c:1074: The handshake operation timed out')))
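The problem was that I was trying to run this inside a virtual machine. After going through the comments, I was able to confirm that requests made from inside the VM never reached the website, whereas on my local machine there was no such problem and I could geocode every postcode. That matches the last line of the traceback: ProxyError('Cannot connect to proxy.') is a network failure that RateLimiter merely caught and retried, not a rate-limit rejection from Nominatim.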
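For anyone hitting the same traceback, a quick connectivity check inside the VM can confirm this before touching the geocoding code. The following is only a minimal sketch using the standard library; `proxy_endpoint`, `can_connect`, and `diagnose` are hypothetical helper names, not part of geopy, and it assumes the proxy settings returned by `urllib.request.getproxies()` are the ones the failing HTTP stack would also pick up.

```python
import socket
from urllib.parse import urlsplit
from urllib.request import getproxies


def proxy_endpoint(proxies, scheme="https"):
    """Return (host, port) of the proxy configured for `scheme`, or None."""
    url = proxies.get(scheme)
    if not url:
        return None
    # Proxy settings are sometimes written without a scheme ("host:3128").
    parts = urlsplit(url if "://" in url else "http://" + url)
    if parts.hostname is None:
        return None
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def can_connect(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, DNS failures
        return False


def diagnose(target_host="nominatim.openstreetmap.org"):
    """Describe whether the configured proxy (or the target itself) is reachable."""
    endpoint = proxy_endpoint(getproxies())
    if endpoint is None:
        status = "succeeded" if can_connect(target_host, 443) else "failed"
        return f"No HTTPS proxy configured; direct connection to {target_host}:443 {status}."
    host, port = endpoint
    status = "is reachable" if can_connect(host, port) else "is NOT reachable"
    return f"HTTPS proxy {host}:{port} {status}."
```

Running `print(diagnose())` in the VM and again on the local machine should show where the two environments differ. A TCP connection succeeding does not prove the TLS handshake through the proxy will also succeed, so treat a "reachable" result as a hint rather than a guarantee.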