Python & BeautifulSoup - can I use the function findAll repeatedly?
In the BS documentation, they write:
Remember the soup.head.title trick from Navigating using tag names? That trick works by repeatedly calling find():
soup.head.title
# <title>The Dormouse's story</title>
soup.find("head").find("title")
# <title>The Dormouse's story</title>
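Can I do the same thing with findAll? I can't get it to work...
No, you cannot chain findAll, because it returns a bs4.element.ResultSet, which is basically a list and has no findAll method. If you try, you will get an AttributeError.
A bs4.element.ResultSet has far fewer attributes than a bs4.element.Tag, and most of them are just regular list methods: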
fn = soup.findAll("title")
fn.append fn.copy fn.extend fn.insert fn.remove fn.sort
fn.clear fn.count fn.index fn.pop fn.reverse fn.source
By comparison, the attributes of the bs4.element.Tag returned by .find:
In [25]: f = soup.find("title")
In [26]: f.
Display all 100 possibilities? (y or n)
f.HTML_FORMATTERS f.has_attr
f.XML_FORMATTERS f.has_key
f.append f.hidden
f.attribselect_re f.index
f.attrs f.insert
f.can_be_empty_element f.insert_after
f.childGenerator f.insert_before
f.children f.isSelfClosing
f.clear f.is_empty_element
f.contents f.name
f.decode f.namespace
f.decode_contents f.next
f.decompose f.nextGenerator
f.descendants f.nextSibling
f.encode f.nextSiblingGenerator
f.encode_contents f.next_element
f.extract f.next_elements
f.fetchNextSiblings f.next_sibling
f.fetchParents f.next_siblings
f.fetchPrevious f.parent
f.fetchPreviousSiblings f.parentGenerator
f.find f.parents
f.findAll f.parserClass
f.findAllNext f.parser_class
f.findAllPrevious f.prefix
f.findChild f.prettify
f.findChildren f.previous
f.findNext f.previousGenerator
f.findNextSibling f.previousSibling
f.findNextSiblings f.previousSiblingGenerator
f.findParent f.previous_element
f.findParents f.previous_elements
f.findPrevious f.previous_sibling
f.findPreviousSibling f.previous_siblings
f.findPreviousSiblings f.recursiveChildGenerator
f.find_all f.renderContents
f.find_all_next f.replaceWith
f.find_all_previous f.replaceWithChildren
f.find_next f.replace_with
f.find_next_sibling f.replace_with_children
f.find_next_siblings f.select
f.find_parent f.select_one
f.find_parents f.setup
f.find_previous f.string
f.find_previous_sibling f.strings
f.find_previous_siblings f.stripped_strings
f.format_string f.tag_name_re
f.get f.text
f.getText f.unwrap
f.get_text f.wrap
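To make the difference concrete, here is a minimal sketch. The HTML snippet and the variable names are made up for illustration, and html.parser is just one possible parser choice. It shows that chaining on the ResultSet fails, while calling findAll on each Tag inside it works:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this example.
html = """
<div class="box"><p>one</p><p>two</p></div>
<div class="box"><p>three</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

divs = soup.findAll("div")  # a ResultSet, i.e. a list of Tag objects

# Chaining directly on the ResultSet fails: it has no findAll method.
try:
    divs.findAll("p")
    chained_ok = True
except AttributeError:
    chained_ok = False

# Calling findAll on each Tag in the ResultSet works.
paragraphs = [p for div in divs for p in div.findAll("p")]
texts = [p.get_text() for p in paragraphs]

print(chained_ok, texts)
```

So the usual workaround is a loop or comprehension over the first result, searching again inside each element.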
If every search gives a single result, or you know the exact index, you can chain it like this:
listItems = soup.findAll("div", { "class": "table-wrap" })[0] \
.findAll("table")[0] \
.findAll("tr")
This works on markup such as:
<div class="table-wrap">
<table>
<tr>
...
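As a rough sketch of the index-based chain next to two alternatives, assuming a small made-up document with the same div/table/tr nesting: find() returns the first match directly, so the [0] can be dropped, and select(), which appears in the Tag attribute list above, takes a CSS selector instead. The sample HTML and variable names are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with the same nesting as the snippet above.
html = """
<div class="table-wrap">
  <table>
    <tr><td>a</td></tr>
    <tr><td>b</td></tr>
  </table>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Index-based chaining: take the first match at each level.
rows_chained = soup.findAll("div", {"class": "table-wrap"})[0] \
    .findAll("table")[0] \
    .findAll("tr")

# find() already returns the first matching Tag, so no index is needed.
rows_find = soup.find("div", {"class": "table-wrap"}).find("table").findAll("tr")

# A CSS selector matches rows inside every table-wrap div, not only the first.
rows_select = soup.select("div.table-wrap table tr")

row_texts = [tr.get_text(strip=True) for tr in rows_chained]
print(row_texts)
```

Note that both chained forms fail when a level has no match: [0] on an empty ResultSet raises IndexError, and find() returns None, so the next call raises AttributeError. The selector version simply returns an empty result in that case.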