bs4 and requests proxy script error
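I'm currently writing a script that pulls information from supremenewyork.com.
The proxy script I'm using worked "sorta" before, but now it doesn't work at all. I noticed something called urllib3 on my computer, assumed it was useless, and uninstalled it. When I then tried to run my proxy script I got an error mentioning urllib3, so I quickly reinstalled urllib3, but my script hasn't worked since.
Here is my script: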
import requests
from bs4 import BeautifulSoup
UK_Proxy1 = input('UK http Proxy1: ')
UK_Proxy2 = input('UK http Proxy2: ')
proxies = {
    'http': 'http://' + UK_Proxy1 + '',
    'https': 'http://' + UK_Proxy2 + '',
}
categorys = ['jackets','shirts','tops_sweaters','sweatshirts','pants','shorts','t-shirts','hats','bags','accessories','shoes','skate']
catNumb = 0
for cat in categorys:
    catStr = str(categorys[catNumb])
    cUrl = 'http://www.supremenewyork.com/shop/all/' + catStr
    proxy_script = requests.get(cUrl, proxies=proxies).text
    bSoup = BeautifulSoup(proxy_script, 'lxml')
    print('\n*******************"'+ catStr.upper() + '"*******************\n')
    catNumb += 1
    for item in bSoup.find_all('div', class_='inner-article'):
        url = item.a['href']
        alt = item.find('img')['alt']
        req = requests.get('http://www.supremenewyork.com' + url)
        item_soup = BeautifulSoup(req.text, 'lxml')
        name = item_soup.find('h1', itemprop='name').text
        style = item_soup.find('p', itemprop='model').text
        print (alt + ' --- ' + name + ' --- ' + style)
When I run this script and enter a UK proxy, I get this error:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/util/connection.py", line 60, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 357, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x112d10eb8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known
During handling of the above exception, another exception occurred: (same error as above and continues for a bit)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='109.108.153.29\t', port=80): Max retries exceeded with url: http://www.supremenewyork.com/shop/all/jackets (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPConnection object at 0x112d10eb8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',)))
I've tried several different proxies and none of them work.
Can someone help me? I'd really appreciate it.
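Here is the baseline of the answer:
It doesn't work because the script cannot connect to the proxy. In the error below the proxy host is '109.108.153.29\t', so a tab character ended up in the address (probably pasted along with it), which would explain why the name lookup fails with [Errno 8].
All you have to do is give it a working proxy and a port. You would get the same error if your computer had no internet connection, but since you're on Stack Overflow I'll assume you have one.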
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='109.108.153.29\t', port=80): Max retries exceeded with url: http://www.supremenewyork.com/shop/all/jackets (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPConnection object at 0x112d10eb8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',)))
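Since the host in that message carries stray whitespace, it may help to clean up whatever is typed or pasted before handing it to requests. The following is only a rough sketch: `build_proxies` is a hypothetical helper name, and falling back to port 80 when no port is given is an assumption, not something the original script does.

```python
def build_proxies(raw, default_port=80):
    """Turn user-typed 'host[:port]' text into a requests-style proxies dict."""
    # Drop surrounding whitespace (tabs, newlines) that often comes with pasted text.
    address = raw.strip()
    # Remove a scheme if the user pasted one, so it is not added twice.
    for scheme in ('http://', 'https://'):
        if address.startswith(scheme):
            address = address[len(scheme):]
    if not address:
        raise ValueError('empty proxy address')
    # Assumption: use default_port when the user gave no port.
    if ':' not in address:
        address = '{}:{}'.format(address, default_port)
    proxy_url = 'http://' + address
    # A plain HTTP proxy is addressed with http:// for both kinds of target URL.
    return {'http': proxy_url, 'https': proxy_url}


# The address from the error message, including its trailing tab.
print(build_proxies('109.108.153.29\t'))
```

With something like this, `proxies = build_proxies(input('UK http Proxy1: '))` could replace the string concatenation in the script. It only cleans the text; it cannot tell whether the proxy is actually reachable.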
Here is how we can fix it:
Use a working proxy and give it a port.
You can get a list of proxies from this site and use random.choice() to pick one at random each time: http://www.gatherproxy.com/
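The random.choice() suggestion could look roughly like the sketch below. The addresses in `PROXY_POOL` are placeholders from the reserved documentation range, not real proxies, and `pick_proxies` is a hypothetical helper name.

```python
import random

# Placeholder addresses (192.0.2.0/24 is reserved for documentation);
# replace them with working host:port entries from a proxy list.
PROXY_POOL = ['192.0.2.10:8080', '192.0.2.11:3128', '192.0.2.12:80']


def pick_proxies(pool):
    """Pick one proxy at random and return it as a requests-style proxies dict."""
    address = random.choice(pool)
    proxy_url = 'http://' + address
    return {'http': proxy_url, 'https': proxy_url}


proxies = pick_proxies(PROXY_POOL)
print(proxies)
```

Note that random.choice() can return the same entry twice in a row, so this varies the proxy between requests but does not guarantee a different one every time.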
import requests
from bs4 import BeautifulSoup
# Working proxy in host:port form, with no stray whitespace.
UK_Proxy1 = '173.212.202.65:80'
# UK_Proxy2 = input('UK http Proxy2: ')
# The dictionary keys refer to the scheme of the target URL; a plain HTTP
# proxy is addressed with http:// in both entries.
proxies = {
    'http': 'http://' + UK_Proxy1,
    'https': 'http://' + UK_Proxy1
}
categorys = ['jackets','shirts','tops_sweaters','sweatshirts','pants','shorts','t-shirts','hats','bags','accessories','shoes','skate']
base_url = 'http://www.supremenewyork.com'
for catStr in categorys:
    cUrl = base_url + '/shop/all/' + catStr
    proxy_script = requests.get(cUrl, proxies=proxies).text
    bSoup = BeautifulSoup(proxy_script, 'lxml')
    print('\n*******************"' + catStr.upper() + '"*******************\n')
    for item in bSoup.find_all('div', class_='inner-article'):
        url = item.a['href']
        alt = item.find('img')['alt']
        # Send the item request through the proxy as well.
        req = requests.get(base_url + url, proxies=proxies)
        item_soup = BeautifulSoup(req.text, 'lxml')
        name = item_soup.find('h1', itemprop='name').text
        style = item_soup.find('p', itemprop='model').text
        print(alt + ' --- ' + name + ' --- ' + style)
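Free proxies tend to go offline without notice, so a single hard-coded address may stop working again. One possible way to cope, sketched under assumptions: try each address in a list until one answers. `get_via_first_working_proxy` is a hypothetical helper, the `get` parameter exists only so the function can be exercised without a network, and the 10-second timeout is an arbitrary choice.

```python
import requests


def get_via_first_working_proxy(url, pool, get=requests.get, timeout=10):
    """Return the first successful response, trying each proxy in pool in order."""
    last_error = None
    for address in pool:
        proxy_url = 'http://' + address.strip()
        proxies = {'http': proxy_url, 'https': proxy_url}
        try:
            response = get(url, proxies=proxies, timeout=timeout)
            # Treat HTTP error statuses (4xx/5xx) as a failed attempt too.
            response.raise_for_status()
            return response
        except requests.exceptions.RequestException as error:
            # ProxyError, timeouts and HTTPError all derive from RequestException.
            last_error = error
    raise RuntimeError('no proxy in the pool worked') from last_error
```

In the script above, `requests.get(cUrl, proxies=proxies)` could then become `get_via_first_working_proxy(cUrl, ['173.212.202.65:80'])`, with more addresses added to the list as fallbacks.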