BeautifulSoup example given in the Python documentation not working
I am trying out the examples given in the BeautifulSoup documentation, and one of them does not give the expected result:
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
<p class="story">...</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc)
The example says:
soup.find_all('b')
# [<b>The Dormouse's story</b>]
But when I try the same command, I get the following error:
>>> soup.find_all('b')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
But the soup object is not None:
>>> soup
<html><head><title>The Dormouse's story</title></head>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
<p class="story">...</p>
</html>
I am not sure why the given example does not work.
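You are using BeautifulSoup version 3, not version 4.
In BeautifulSoup 3, the method is called findAll(), not find_all(). Because accessing an attribute that is not recognized is translated to soup.find('unrecognized_attribute'), you asked BeautifulSoup to find the first <find_all> HTML element. No such element exists, so None is returned, and calling None raises the TypeError.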
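The attribute fallback described above can be illustrated without installing BeautifulSoup 3 at all. The following is only a rough sketch: `TinySoup` is a hypothetical stand-in written for this answer, not the real BeautifulSoup class, and it assumes nothing beyond the behaviour already described (unknown attributes are forwarded to find()).

```python
class TinySoup:
    """Hypothetical stand-in for a BeautifulSoup 3 object (not the real class)."""

    def __init__(self, tags):
        # Map of tag name -> list of elements, standing in for a parsed document.
        self._tags = tags

    def find(self, name):
        # Return the first element with this tag name, or None if there is none.
        matches = self._tags.get(name, [])
        return matches[0] if matches else None

    def findAll(self, name):
        # Return every element with this tag name.
        return list(self._tags.get(name, []))

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        # Treat the unknown attribute as a tag name, as described above.
        return self.find(name)


soup = TinySoup({"b": ["<b>The Dormouse's story</b>"]})

print(soup.findAll("b"))  # the method that really exists
print(soup.find_all)      # None: there is no <find_all> element

try:
    soup.find_all("b")    # same as None("b")
except TypeError as exc:
    print(exc)            # 'NoneType' object is not callable
```

The error therefore says nothing about soup itself being None; it is the result of the attribute lookup that is None.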
Use BeautifulSoup 4 instead:
from bs4 import BeautifulSoup
where you are almost certainly using:
from BeautifulSoup import BeautifulSoup # version 3
You will need to install the beautifulsoup4 project (for example with pip install beautifulsoup4).
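If it is unclear which of the two packages is present, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper name `installed_soup_packages` is made up for illustration, and it only tests whether the two top-level module names mentioned above can be found, without importing them.

```python
import importlib.util


def installed_soup_packages():
    """Report which BeautifulSoup import names can be found (hypothetical helper)."""
    modules = {
        "bs4": "BeautifulSoup 4",            # provides find_all()
        "BeautifulSoup": "BeautifulSoup 3",  # provides findAll() only
    }
    return {
        label: importlib.util.find_spec(module) is not None
        for module, label in modules.items()
    }


for label, present in installed_soup_packages().items():
    print(label, "is installed" if present else "is not installed")
```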
Demo:
>>> html_doc = """
... <html><head><title>The Dormouse's story</title></head>
...
... <p class="title"><b>The Dormouse's story</b></p>
...
... <p class="story">Once upon a time there were three little sisters; and their names were
... <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
... <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
... <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
... <p class="story">...</p>
... """
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(html_doc)
>>> soup.find_all('b')
[<b>The Dormouse's story</b>]
>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup(html_doc)
>>> soup.find_all('b')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable