Method Not Allowed in Python 3 using urllib2
from bs4 import BeautifulSoup
import urllib.request as urllib2
url="http://www.scmp.com/news/world"
page = urllib2.urlopen(url)
soup = BeautifulSoup(page, "html.parser")
item = soup.find_all("h3", class_="node-title lvl_24-title")
print(item)
This code gives a Method Not Allowed error only on this URL; the same code works fine for most of the other URLs I am trying.
Below is the full error message:
Traceback (most recent call last):
  File "E:/Scrappers/test11.py", line 6, in <module>
    page = urllib2.urlopen(url)
  File "C:\Program Files (x86)\Python36-32\lib\urllib\request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files (x86)\Python36-32\lib\urllib\request.py", line 532, in open
    response = meth(req, response)
  File "C:\Program Files (x86)\Python36-32\lib\urllib\request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Program Files (x86)\Python36-32\lib\urllib\request.py", line 570, in error
    return self._call_chain(*args)
  File "C:\Program Files (x86)\Python36-32\lib\urllib\request.py", line 504, in _call_chain
    result = func(*args)
  File "C:\Program Files (x86)\Python36-32\lib\urllib\request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 405: Method Not Allowed
This question may be a duplicate of an existing question.
Because your urlopen request does not specify a User-Agent, the site has detected you as a bot. May I recommend the less painful "requests" library?
import requests
from bs4 import BeautifulSoup
# Specify some headers. urlopen sends "Python-urllib/<version>" as its User-Agent, which makes you look like a bot.
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36'}
url = 'http://www.scmp.com/news/world'
page = requests.get(url, headers=headers)
# The 'lxml' parser needs the lxml package installed; 'html.parser' works without it.
soup = BeautifulSoup(page.content, 'lxml')
Voilà! You have some soup to play with.
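If you would rather stay with the standard library, the same idea should work with urllib itself: wrap the URL in a urllib.request.Request that carries a User-Agent header and pass that to urlopen. This is only a sketch. The names BROWSER_UA, build_request and fetch_html are made up for illustration, the User-Agent string is simply the one from the requests example, and I have not confirmed that this particular site accepts it.

```python
import urllib.request
from urllib.error import HTTPError

# Assumption: any common browser string will do; this one is copied
# from the requests example in this answer.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a GET Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Fetch url and return the raw body bytes.

    HTTP errors are re-raised as RuntimeError with the status code attached.
    """
    request = build_request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        # err.code is the HTTP status, e.g. 405 as in the traceback in the question.
        raise RuntimeError(f"{url} answered HTTP {err.code}: {err.reason}") from err
```

The bytes returned by fetch_html(url) can be handed to BeautifulSoup(..., "html.parser") exactly as in the question's code.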