Gracefully catch a Python exception in an iterator's next()
I am scraping some repositories with PyGithub, but I am running into errors while iterating over the search pages.
def scrape_interval(self, interval):
    for repo_number, repo in self.search(interval):
        code...

def search(self, interval):
    try:
        iterator = enumerate(self.github.search_repositories(query="Laravel created:" + interval))
    except:
        print.warning("Going to sleep for 1 hour. The search API hit the limit")
        time.sleep(3600)
        iterator = self.search(interval)
    return iterator
As you can see, I try to catch the error where the iterator is created in def search. But the error is thrown on the line for repo_number, repo in self.search(interval):, so presumably at the moment the iterator fetches the next item?
What options do I have for catching these errors? I would prefer not to wrap the whole for loop in a try clause, but rather handle the errors as they occur during iteration.
The error itself, for reference:
File "/Users/olofjondelius/Documents/Code/laravel-ai/src/examples/migration-analysis/../../GithubScraper.py", line 47, in scrape_interval
for repo_number, repo in self.search(interval):
File "/anaconda3/envs/laravel-ai/lib/python3.7/site-packages/github/PaginatedList.py", line 58, in _iter_
newElements = self._grow()
File "/anaconda3/envs/laravel-ai/lib/python3.7/site-packages/github/PaginatedList.py", line 70, in _grow
newElements = self._fetchNextPage()
File "/anaconda3/envs/laravel-ai/lib/python3.7/site-packages/github/PaginatedList.py", line 172, in _fetchNextPage
headers=self.__headers
File "/anaconda3/envs/laravel-ai/lib/python3.7/site-packages/github/Requester.py", line 185, in requestJsonAndCheck
return self.__check(*self.requestJson(verb, url, parameters, headers, input, cnx))
File "/anaconda3/envs/laravel-ai/lib/python3.7/site-packages/github/Requester.py", line 231, in requestJson
return self.__requestEncode(cnx, verb, url, parameters, headers, input, encode)
File "/anaconda3/envs/laravel-ai/lib/python3.7/site-packages/github/Requester.py", line 284, in __requestEncode
status, responseHeaders, output = self.__requestRaw(cnx, verb, url, requestHeaders, encoded_input)
File "/anaconda3/envs/laravel-ai/lib/python3.7/site-packages/github/Requester.py", line 309, in __requestRaw
requestHeaders
File "/anaconda3/envs/laravel-ai/lib/python3.7/http/client.py", line 1229, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/anaconda3/envs/laravel-ai/lib/python3.7/http/client.py", line 1275, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/anaconda3/envs/laravel-ai/lib/python3.7/http/client.py", line 1224, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/anaconda3/envs/laravel-ai/lib/python3.7/http/client.py", line 1016, in _send_output
self.send(msg)
File "/anaconda3/envs/laravel-ai/lib/python3.7/http/client.py", line 956, in send
self.connect()
File "/anaconda3/envs/laravel-ai/lib/python3.7/http/client.py", line 1384, in connect
super().connect()
File "/anaconda3/envs/laravel-ai/lib/python3.7/http/client.py", line 928, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/anaconda3/envs/laravel-ai/lib/python3.7/socket.py", line 707, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/anaconda3/envs/laravel-ai/lib/python3.7/socket.py", line 748, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
It sounds like the exception is raised while you iterate over the iterator, not while you create it. Your current try/except block only catches exceptions raised immediately by the call to self.github.search_repositories, not exceptions raised later while you consume the result.
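To make that lazy behaviour concrete, here is a minimal sketch (deliberately not using PyGithub) of an iterator whose creation succeeds and whose failure only surfaces once the for loop asks for the next item, which is the same pattern as PaginatedList fetching pages on demand:

def lazy_pages():
    # Nothing in this body runs until the first next() call.
    yield "page 1"
    raise ConnectionError("network went away while fetching page 2")

it = lazy_pages()              # creating the iterator raises nothing
try:
    for page in it:            # the error only appears during iteration
        print(page)
except ConnectionError as exc:
    print("caught during iteration:", exc)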
To fix this, you can turn the search function into a generator. That way it can keep yielding values for as long as it has them, while still catching the exception and retrying as often as necessary.
Try something like this:
def search(self, interval):
    while True:
        try:
            it = enumerate(self.github.search_repositories(query="Laravel created:" + interval))
            yield from it
            return  # if we completed the yield from without an exception, we're done!
        except:  # you should probably limit this to catching specific exception types
            print("Going to sleep for 1 hour. The search API hit the limit")
            time.sleep(3600)
As I noted in the comments, you should probably change the bare except to except socket.gaierror or something similar, so that you don't suppress all exceptions, only the ones you actually expect and that the delay will fix. Anything truly unexpected should still be allowed to stop the program, since it may reflect a bug somewhere else in your code.
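For completeness, here is a sketch of the same generator with the bare except narrowed down. socket.gaierror is taken straight from your traceback; whether that is the only error worth retrying on (you might, for example, also want PyGithub's rate-limit exception) is an assumption you should adjust to what you actually see. It is meant as a drop-in variant of the method above, assuming socket and time are imported at the top of GithubScraper.py:

import socket
import time

def search(self, interval):
    while True:
        try:
            it = enumerate(self.github.search_repositories(query="Laravel created:" + interval))
            yield from it
            return  # finished without an exception, so stop retrying
        except socket.gaierror:
            # Retry only the DNS/connection failure seen in the traceback;
            # anything unexpected still propagates and stops the program.
            print("Going to sleep for 1 hour. The search API hit the limit")
            time.sleep(3600)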