Opening a link from a variable in Python (urllib2.urlopen) + BeautifulSoup4
I am using Python 2.7 + urllib2 + BeautifulSoup4.
When I pass the URL as a string literal:
soup = BeautifulSoup(urllib2.urlopen('http://www.some-website.com', 'html'))
it works fine, but when I move the URL into a variable, it does not work.
variable = 'http://www.some-website.com'
soup = BeautifulSoup(urllib2.urlopen(variable, 'html'))
Error:
Edit: the traceback ends with:
  File "C:\Python27\lib\urllib2.py", line 285, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: api/Abc-Abc/def/7/179
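Solved.
The problem was that one of the links was only a reference into the server's database (a relative path such as api/Abc-Abc/def/7/179, with no http:// scheme), so urllib2 could not tell what type of URL it was.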
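One way to handle links like the one in the traceback is to resolve them against the page they were scraped from before calling urlopen. The following is a minimal sketch, not the original poster's code: `BASE_URL`, `to_absolute`, and the `links` list are hypothetical names for illustration, and the base URL is assumed to be the site from the question. The import falls back to the Python 2.7 module name.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2.7

# Hypothetical base URL: the page the links were collected from.
BASE_URL = 'http://www.some-website.com/'


def to_absolute(link, base=BASE_URL):
    """Resolve a possibly relative link against base.

    urljoin returns links that are already absolute unchanged.
    """
    return urljoin(base, link)


# Illustrative input: one absolute link and the relative one from the traceback.
links = ['http://www.some-website.com/page', 'api/Abc-Abc/def/7/179']
absolute_links = [to_absolute(link) for link in links]
```

Each entry in `absolute_links` then carries a scheme, so passing it to urlopen no longer raises "unknown url type" (whether the server actually serves that path is a separate question).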
# Note: use a live website such as http://vaibhavmule.com, not the placeholder http://some-website.com
variable = 'http://www.some-website.com'  # Do not forget the 'http://' prefix here
# 'html' is not a parser library. In the question it was also passed to urlopen
# (as its second argument, data) rather than to BeautifulSoup, so it is dropped here.
soup = BeautifulSoup(urllib2.urlopen(variable))
This should work.
See the reference on using a parser library.
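The following should work:
import urllib2
from bs4 import BeautifulSoup  # BeautifulSoup4, as used in the question

var = 'http://www.example.com'
html = urllib2.urlopen(var).read()  # fetch the page body as a string
soup = BeautifulSoup(html, 'html.parser')  # parser goes to BeautifulSoup, not urlopen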
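Since the ValueError in the question came from a string with no scheme, it can help to check each URL before handing it to urlopen. This is only a sketch under assumptions: `has_http_scheme` and the `candidates` list are hypothetical, and the check accepts only http and https URLs that name a host.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7


def has_http_scheme(url):
    """Return True if url has an http/https scheme and a host part."""
    parts = urlparse(url)
    return parts.scheme in ('http', 'https') and bool(parts.netloc)


# Illustrative input: a full URL and the relative path from the traceback.
candidates = ['http://www.example.com', 'api/Abc-Abc/def/7/179']
fetchable = [url for url in candidates if has_http_scheme(url)]
skipped = [url for url in candidates if not has_http_scheme(url)]
```

Only the entries in `fetchable` would be passed on to urlopen; the ones in `skipped` can be logged or resolved against a base URL first.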