Python urllib error with HTTPS websites
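I am using Python 3.4 on Windows 7, and I am trying to use a Python script to test whether a proxy allows or denies connections to specific websites.
I am using the code below: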
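import urllib.request
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

# login, password and proxy ("host:port") are assumed to be defined earlier in the script
conf = "http://{}:{}@{}".format(login, password, proxy)
# Only the "http" scheme is mapped to the proxy here
supp = urllib.request.ProxyHandler({"http": conf})
auth = urllib.request.HTTPBasicAuthHandler()
# Named "opener" rather than "open" to avoid shadowing the built-in open()
opener = urllib.request.build_opener(supp, auth, urllib.request.HTTPHandler)
urllib.request.install_opener(opener)
response = urlopen(Request("http://www.google.com"))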
The code above runs without errors, but as soon as I switch the URL to HTTPS (for example, https://www.google.com), I get the following error:
C:\Python34\python.exe test_url.py
Traceback (most recent call last):
  File "C:\Python34\lib\urllib\request.py", line 1182, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\http\client.py", line 1088, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1126, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1084, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 922, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 857, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 1223, in connect
    super().connect()
  File "C:\Python34\lib\http\client.py", line 834, in connect
    self.timeout, self.source_address)
  File "C:\Python34\lib\socket.py", line 494, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
  File "C:\Python34\lib\socket.py", line 533, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11004] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    response = urlopen(Request("https://www.google.com"))
  File "C:\Python34\lib\urllib\request.py", line 161, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 463, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 481, in _open
    '_open', req)
  File "C:\Python34\lib\urllib\request.py", line 441, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1225, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "C:\Python34\lib\urllib\request.py", line 1184, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 11004] getaddrinfo failed>
Any idea why my code only works for HTTP websites?
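You need to specify the proxy for HTTPS separately, because it is a different protocol from HTTP. ProxyHandler maps each URL scheme to a proxy, so with only an "http" entry the HTTPS request is not sent through the proxy; urllib tries to connect to the site directly, which is most likely why the getaddrinfo lookup fails. The ProxyHandler line should therefore be changed to: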
supp = urllib.request.ProxyHandler({"http": conf, "https": conf})
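For reference, here is a minimal sketch of how the whole proxy test could look with both schemes mapped. The helper names build_proxy_opener and check_url, the 10-second timeout, and the result strings are my own choices for illustration and are not from the original script; the proxy host in the usage comment is a placeholder.

```python
import urllib.request
from urllib.error import HTTPError, URLError


def build_proxy_opener(login, password, proxy):
    """Return an opener that sends both HTTP and HTTPS requests through the proxy.

    `proxy` is expected as "host:port" (assumption for this sketch).
    """
    conf = "http://{}:{}@{}".format(login, password, proxy)
    # One entry per URL scheme; a missing scheme is not proxied by this handler.
    handler = urllib.request.ProxyHandler({"http": conf, "https": conf})
    return urllib.request.build_opener(handler)


def check_url(opener, url, timeout=10):
    """Try to fetch `url` through `opener` and describe the outcome as a string."""
    try:
        with opener.open(url, timeout=timeout) as response:
            return "allowed (HTTP {})".format(response.status)
    except HTTPError as err:
        # The proxy or the site answered, but with an error status (e.g. 403, 407).
        return "refused (HTTP {})".format(err.code)
    except URLError as err:
        # No usable answer at all: DNS failure, connection refused, timeout, ...
        return "connection failed ({})".format(err.reason)


# Example usage (hypothetical credentials and proxy address):
# opener = build_proxy_opener("user", "secret", "proxy.example.com:8080")
# print(check_url(opener, "https://www.google.com"))
```

Using the opener object directly instead of install_opener keeps the proxy settings local to the test. If the login or password contains characters such as "@" or ":", they would probably need to be percent-encoded (for example with urllib.parse.quote) before being placed in the proxy URL.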