Beautifulsoup findAll, how to get the second text

Question

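I'm doing some Python work and couldn't find an answer to this question, so I'm hoping someone can help. I'm using findAll in Python and the text output contains two numbers, but I only want the second number, not the first. How do I target the second number?

Here is my code: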

product_price_container_after = container.findAll("div",{"class":"discounted"})
product_price_after = product_price_container_after[0].text
print(product_price_after)

This is the markup I'm trying to get it from:

<div class="col search_price discounted responsive_secondrow">
<span style="color: #888888;"><strike>59,98€</strike></span><br/>19,99€
                                        </div>

So the output is:

59,98€19,99€

How do I get only 19,99€?

Thanks for your help.

Answer 1

Sorry, I can't reproduce your code because it's incomplete. Try this, though:

product_price_after = product_price_container_after[1].text
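
One thing worth checking with this suggestion: the list returned by findAll holds matching elements, so an index picks among the matched divs rather than among the text pieces inside one div. Below is a minimal sketch of that behaviour. The second div and its prices are made up for illustration, and html.parser is assumed so that nothing beyond bs4 is needed.

```python
from bs4 import BeautifulSoup

# First div mirrors the question's markup; the second div and its prices
# are hypothetical, added only to show what index [1] refers to.
html = """
<div class="discounted"><span><strike>59,98€</strike></span><br/>19,99€</div>
<div class="discounted"><span><strike>10,00€</strike></span><br/>5,00€</div>
"""

soup = BeautifulSoup(html, "html.parser")
divs = soup.find_all("div", {"class": "discounted"})

first_text = divs[0].text       # both prices of the first div, concatenated
second_div_text = divs[1].text  # both prices of the second div

print(len(divs))
print(first_text)
print(second_div_text)
```

With only one matching div in the container, index [1] would raise an IndexError instead.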

Answer 2

You can use stripped_strings:

import requests
from bs4 import BeautifulSoup as bs

res = requests.get('https://store.steampowered.com/search/?specials=1&page=1')
soup = bs(res.content, 'lxml')
prices = soup.select('.discounted')

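for price in prices:
    # stripped_strings yields each non-empty text fragment, whitespace-trimmed
    strings = list(price.stripped_strings)
    if len(strings) > 1:  # skip entries that have no second text fragment
        print(strings[1])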

Or next_sibling:

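for price in prices:
    br = price.find('br')
    # Guard against entries with no <br> or with nothing after it
    if br is not None and br.next_sibling is not None:
        print(str(br.next_sibling).strip())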
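
To try both approaches without hitting the network, here is a small sketch that runs them against the exact markup from the question. It assumes html.parser instead of lxml (so only bs4 is required), and the variable names are mine.

```python
from bs4 import BeautifulSoup

# Markup copied from the question. html.parser is an assumption so the
# sketch runs without lxml installed.
html = """
<div class="col search_price discounted responsive_secondrow">
<span style="color: #888888;"><strike>59,98€</strike></span><br/>19,99€
                                        </div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.select_one(".discounted")

# Approach 1: collect the trimmed, non-empty text fragments and take the second.
strings = list(div.stripped_strings)
second_text = strings[1] if len(strings) > 1 else None

# Approach 2: take the text node that directly follows <br/>.
br = div.find("br")
after_br = None
if br is not None and br.next_sibling is not None:
    after_br = str(br.next_sibling).strip()

print(second_text)
print(after_br)
```

The next_sibling value carries the trailing newline and indentation from the markup, which is why the sketch strips it before printing.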

Answer 3

You can use the decompose() or extract() method to remove an element from the tree.

discountedDivs = container.findAll("div", {"class": "discounted"})

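for discountedDiv in discountedDivs:
    span = discountedDiv.find("span")
    if span is not None:  # some entries may have no struck-through old price
        span.extract()
    print(discountedDiv.text.strip())  # prints 19,99€ for the markup in the question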
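
As a rough illustration of the difference between the two methods: extract() hands back the removed element, so the old price is still available afterwards, while decompose() destroys it. The sketch below uses extract() on the markup from the question and assumes html.parser. The variable names are only for illustration.

```python
from bs4 import BeautifulSoup

# Markup copied from the question. html.parser is an assumption.
html = """
<div class="col search_price discounted responsive_secondrow">
<span style="color: #888888;"><strike>59,98€</strike></span><br/>19,99€
                                        </div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", {"class": "discounted"})

# extract() detaches the <span> from the tree and returns it.
span = div.find("span")
removed = span.extract() if span is not None else None

old_price = removed.get_text(strip=True) if removed is not None else None
new_price = div.get_text(strip=True)  # only the discounted price remains

print(old_price)
print(new_price)
```

If the old price is not needed at all, span.decompose() could be used in place of extract().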
