Can we reload a page/url in python using urllib or urllib2 or requests or mechanize?
I am trying to open a page/link and capture the content in it.
It sometimes gives me the content I need, and sometimes throws an error.
I see that if I refresh the page a few times, I get the content.
So, I want to reload the page and catch it.
Here is my pseudo code:
attempts = 0
while attempts:
    try:
        open_page = urllib2.Request(www.xyz.com)
        # Or I think we can also do urllib2.urlopen(www.xyz.com)
        break
    except:
        # here I want to refresh/reload the page
        attempts += 1
My questions are:
1. How can I reload the page using urllib, urllib2, requests, or mechanize?
2. Can we loop try-catch like that?
Thanks!
If you do while attempts when attempts equals 0, you will never enter the loop. I would do it the other way around, initializing attempts to the number of reloads you want:
import urllib2

attempts = 10
while attempts:
    try:
        # Request only builds a request object; urlopen performs the
        # actual fetch, so this is where a network error would be raised
        open_page = urllib2.urlopen('http://www.xyz.com')
    except urllib2.URLError:
        attempts -= 1
    else:
        attempts = False  # success: a falsy value ends the loop
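Note that urllib2 only exists on Python 2; on Python 3 it was merged into urllib.request. A minimal sketch of the same retry loop under Python 3 (www.xyz.com is a placeholder URL):

import urllib.request
import urllib.error

attempts = 10
while attempts:
    try:
        open_page = urllib.request.urlopen('http://www.xyz.com')
    except urllib.error.URLError:
        attempts -= 1  # failed: spend one attempt and retry
    else:
        break  # success: stop retrying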
You can also let requests retry for you at the transport level instead of writing the loop yourself:

import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

attempts = 10
# Retry failed connections and the listed 5xx status codes,
# with exponential backoff between attempts
retries = Retry(total=attempts,
                backoff_factor=0.1,
                status_forcelist=[500, 502, 503, 504])

sess = requests.Session()
sess.mount('http://', HTTPAdapter(max_retries=retries))
sess.mount('https://', HTTPAdapter(max_retries=retries))
sess.get('http://www.google.co.nz/')
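With backoff_factor=0.1, urllib3 sleeps progressively longer between attempts (on the order of backoff_factor * 2**n seconds, per its documented formula). When the retry budget is exhausted, the get call raises; a minimal sketch of guarding for that, assuming the session above (the timeout value is an arbitrary choice):

try:
    resp = sess.get('http://www.google.co.nz/', timeout=5)
except requests.exceptions.RetryError:
    # the server kept answering with a status from status_forcelist
    resp = None
except requests.exceptions.ConnectionError:
    # connection-level failures also consume the retry budget
    resp = None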
The following function retries after an exception or whenever the HTTP response does not indicate success.
import time
import traceback

import requests

def retrieve(url):
    # Keep requesting until a successful response comes back.
    while True:
        try:
            response = requests.get(url)
            if response.ok:
                return response
            else:
                print(response.status_code)  # log the failing HTTP status
                time.sleep(3)
                continue
        except Exception:
            print(traceback.format_exc())
            time.sleep(3)
            continue
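A hypothetical usage of this helper (example.com is a placeholder URL). Note that the loop above never gives up; if you want it to stop eventually, cap the attempts as in the first answer:

page = retrieve('http://www.example.com')
print(page.status_code)
print(page.text[:200])  # first 200 characters of the body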