How to identify whether a <p> tag is inside a <table> tag using Python BeautifulSoup?
How can I find the p tags that are inside a table tag and those that are outside it?
<p> word outside p tag inside table tag </p>
<table>
<tbody>
<tr>
<td>
<p> word inside p tag inside table tag </p>
</td>
</tr>
</tbody>
</table>
You can use a CSS selector:
from bs4 import BeautifulSoup
html_doc = """
<p> word outside p tag inside table tag </p>
<table>
<tbody>
<tr>
<td>
<p> word inside p tag inside table tag </p>
</td>
</tr>
</tbody>
</table>
"""
soup = BeautifulSoup(html_doc, "html.parser")
# "table p" matches every <p> that is a descendant of a <table>
for p in soup.select("table p"):
    print(p.text)
Prints:
word inside p tag inside table tag
Or use the bs4 API:
# Search for <p> tags only within each <table>
for table in soup.find_all("table"):
    for p in table.find_all("p"):
        print(p.text)
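Both snippets above only return the p tags inside a table. To also cover the "outside" half of the question, one possible approach is to walk every p tag and check whether it has a table ancestor with find_parent, which returns None when there is no such ancestor. The following is a minimal sketch under that assumption; the list names inside and outside are illustrative, and get_text(strip=True) is used only to trim the surrounding whitespace.

```python
from bs4 import BeautifulSoup

html_doc = """
<p> word outside p tag inside table tag </p>
<table>
<tbody>
<tr>
<td>
<p> word inside p tag inside table tag </p>
</td>
</tr>
</tbody>
</table>
"""

soup = BeautifulSoup(html_doc, "html.parser")

inside, outside = [], []  # illustrative names, not part of the original answer
for p in soup.find_all("p"):
    # find_parent returns the nearest enclosing <table>, or None if there is none
    if p.find_parent("table") is not None:
        inside.append(p.get_text(strip=True))
    else:
        outside.append(p.get_text(strip=True))

print("inside:", inside)
print("outside:", outside)
```

The same check, p.find_parent("table") is not None, also works on a single p tag you already hold, which answers the "is this tag inside a table" form of the question directly.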