AttributeError: 'ResultSet' object has no attribute 'find_all'
What is going wrong? I want to parse the text without the tags.
from bs4 import BeautifulSoup
import re
import urllib.request
f = urllib.request.urlopen("http://www.championat.com/football/news-2442480-orlov-zenit-obespokoen---pole-na-novom-stadione-mozhet-byt-nekachestvennym.html")
soup = BeautifulSoup(f, 'html.parser')
soup=soup.find_all('div', class_="text-decor article__contain")
invalid_tags = ['b', 'i', 'u', 'br', 'a']
for tag in invalid_tags:
    for match in soup.find_all(tag):
        match.replaceWithChildren()
soup = ''.join(map(str, soup.contents))
print (soup)
Error:
Traceback (most recent call last):
  File "1.py", line 9, in <module>
    for match in soup.find_all(tag):
AttributeError: 'ResultSet' object has no attribute 'find_all'
soup=soup.find_all('div', class_="text-decor article__contain")
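On this line, soup becomes a ResultSet instance, which is basically a list of Tag instances. You get 'ResultSet' object has no attribute 'find_all' because a ResultSet instance has no find_all() method. FYI, this problem is actually described in the troubleshooting section of the documentation: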
AttributeError: 'ResultSet' object has no attribute 'foo' - This usually happens because you expected find_all() to return a single tag or string. But find_all() returns a list of tags and strings–a ResultSet object. You need to iterate over the list and look at the .foo of each one. Or, if you really only want one result, you need to use find() instead of find_all().
You really do want a single result here, since there is only one article on the page:
soup = soup.find('div', class_="text-decor article__contain")
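To see the difference without hitting the network, here is a minimal sketch that runs both calls against a small inline snippet. The HTML string and the variable names (html, matches, first, bold_tags) are made up for illustration; only the class name comes from the question.

```python
from bs4 import BeautifulSoup
from bs4.element import ResultSet, Tag

# Hypothetical stand-in for the real page: one article div with inline markup.
html = '<div class="text-decor article__contain"><p>Some <b>bold</b> text</p></div>'
soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list subclass), even for a single match.
matches = soup.find_all("div", class_="text-decor article__contain")
# find() returns the first matching Tag, or None if nothing matches.
first = soup.find("div", class_="text-decor article__contain")

print(type(matches).__name__, len(matches))
print(type(first).__name__)

# If you do keep the ResultSet, call find_all() on each Tag inside it.
bold_tags = [b for div in matches for b in div.find_all("b")]
print(bold_tags)
```

Since find() returns None when nothing matches, it is probably worth checking the result before calling methods on it when scraping a live page.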
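Also note that there is no need to look up the tags one by one: you can pass the list of tag names directly to find_all(). BeautifulSoup is quite flexible when it comes to locating elements: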
article = soup.find('div', class_="text-decor article__contain")
invalid_tags = ['b', 'i', 'u', 'br', 'a']
for match in article.find_all(invalid_tags):
    match.unwrap()  # bs4 alternative for replaceWithChildren
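Putting the pieces together, the whole flow might look like the sketch below. It uses a short inline HTML string instead of the real URL so that it runs offline; that string and the names html, cleaned and plain are assumptions for the example, not content from the actual page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded article page.
html = (
    '<div class="text-decor article__contain">'
    '<p>The <b>pitch</b> at the <a href="#">new stadium</a> may be <i>poor</i>.</p>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

# A single Tag, so find_all() is available on it.
article = soup.find("div", class_="text-decor article__contain")

# Replace each inline tag with its own children, in place.
invalid_tags = ["b", "i", "u", "br", "a"]
for match in article.find_all(invalid_tags):
    match.unwrap()

# Same join as in the question, now applied to a Tag instead of a ResultSet.
cleaned = "".join(map(str, article.contents))
print(cleaned)  # <p>The pitch at the new stadium may be poor.</p>

# If no markup at all is wanted, get_text() drops every remaining tag.
plain = article.get_text()
print(plain)  # The pitch at the new stadium may be poor.
```

If the goal is only the plain text, get_text() alone may be enough and the unwrap() loop can be skipped; unwrapping is useful when block-level tags such as p should stay in the output.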