Python BeautifulSoup: Excluding other tags in a select statement
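I'm having trouble selecting text with BeautifulSoup. I'm trying to get the text only from <span class="data"> elements, but I keep getting results from other elements as well. For example, in the code below the text I want is 'PlayStation 3' and 'Game Boy Advance', not 'PC'. Could you help?
Soup: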
<span class="data">
PlayStation 3
</span>,
<span class="data">
Game Boy Advance
</span>,
<span class="data">
Dec 8, 2022
</span>,
<span class="data">
<a href="/game/pc">
PC
</a>
P.S. This is what I tried.
Code:
consoles = soup.select('span.data')
for console in consoles:
    print(console.get_text(strip=True))
Output snippet:
PlayStation 3
Game Boy Advance
Dec 8, 2022
PC
Thanks!
This example selects all <span class="data"> elements that contain no other tags:
from bs4 import BeautifulSoup
html_doc = """\
<span class="data">
PlayStation 3
</span>,
<span class="data">
Game Boy Advance
</span>,
<span class="data">
Dec 8, 2022
</span>,
<span class="data">
<a href="/game/pc">
PC
</a>
"""
soup = BeautifulSoup(html_doc, "html.parser")
for span in soup.select("span.data:not(:has(*))"):
    print(span.get_text(strip=True))
Prints:
PlayStation 3
Game Boy Advance
Dec 8, 2022
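In the selector above, `:has(*)` matches any element with at least one descendant tag, and `:not(...)` inverts that, so only spans holding plain text remain. If you would rather not rely on CSS pseudo-classes, the same filter can be sketched with `find_all`. This is only a rough equivalent: the helper name `leaf_span_texts` is made up for illustration, and a closing `</span>` is added to the last element, which the original snippet leaves out.

```python
from bs4 import BeautifulSoup

html_doc = """\
<span class="data">
PlayStation 3
</span>,
<span class="data">
Game Boy Advance
</span>,
<span class="data">
Dec 8, 2022
</span>,
<span class="data">
<a href="/game/pc">
PC
</a>
</span>
"""

soup = BeautifulSoup(html_doc, "html.parser")


def leaf_span_texts(soup):
    """Hypothetical helper: text of every span.data that has no child tags."""
    texts = []
    for span in soup.find_all("span", class_="data"):
        # find(True) returns the first descendant tag, or None when the
        # span contains only text nodes.
        if span.find(True) is None:
            texts.append(span.get_text(strip=True))
    return texts


print(leaf_span_texts(soup))
# ['PlayStation 3', 'Game Boy Advance', 'Dec 8, 2022']
```

Note that both versions still return the date, since that span has no child tags either. If you only want the console names, you would need an extra condition on the text itself.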