Getting contents within a class using Beautiful Soup
I have been trying to use Beautiful Soup to get the price of an item ($4.99) from a website, and I was able to retrieve the following:
tag= soup.findAll("div", class_='prod-PriceHero')
[<div class="prod-PriceHero"><span class="hide-content display-inline-block-m"><span class="display-inline-block arrange-fit Price Price--stylized u-textColor" data-tl-id="Price-ProductOffer"><span><span class="Price-currency" content="USD" itemprop="priceCurrency">$</span><span class="Price-characteristic" content="4.99" itemprop="price">4</span><span class="Price-mark">.</span><span class="Price-mantissa">99</span></span></span></span><span class="hide-content-m"><span class="display-inline-block arrange-fit Price u-textColor" data-tl-id="Price-ProductOffer"><span><span class="Price-currency">$</span><span class="Price-characteristic">4</span><span class="Price-mark">.</span><span class="Price-mantissa">99</span></span></span></span></div>]
From here, I tried the following code:
soup=bs(tag,'lxml')
txt=soup.get_text()
print(txt)
But I get the following error: TypeError: expected string or bytes-like object.
Is there a simple way to extract the $4.99 value from this? Thank you for your time.
You can find the spans by their data-tl-id attribute and get all of the text beneath each one via .text:
spans = soup.find_all(attrs={"data-tl-id":"Price-ProductOffer"})
[span.text for span in spans]
Output:
['$4.99', '$4.99']
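The TypeError in the question most likely comes from passing the result of findAll (a list-like ResultSet of Tag objects) back into the BeautifulSoup constructor, which expects markup as a string or bytes. The following is a minimal sketch of two ways around that, not a definitive solution: it assumes a trimmed copy of the markup from the question stored in a hypothetical `html` variable, and uses the built-in "html.parser" instead of lxml so that it runs without extra dependencies. The names `divs`, `prices`, `price_tag`, and `price` are illustrative only.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup shown in the question (assumption: the real
# page has the same structure around the price).
html = (
    '<div class="prod-PriceHero">'
    '<span class="Price" data-tl-id="Price-ProductOffer"><span>'
    '<span class="Price-currency" content="USD" itemprop="priceCurrency">$</span>'
    '<span class="Price-characteristic" content="4.99" itemprop="price">4</span>'
    '<span class="Price-mark">.</span>'
    '<span class="Price-mantissa">99</span>'
    '</span></span></div>'
)

soup = BeautifulSoup(html, "html.parser")

# find_all returns a ResultSet of Tag objects, not markup text, so call
# get_text() on each element instead of parsing the result a second time.
divs = soup.find_all("div", class_="prod-PriceHero")
prices = [div.get_text() for div in divs]
print(prices)

# Alternative: the span marked itemprop="price" carries the full value in
# its "content" attribute, which can be converted to a number directly.
price_tag = soup.find("span", attrs={"itemprop": "price"})
price = float(price_tag["content"])
print(price)
```

Reading the content attribute avoids stitching the dollar sign, integer part, and decimal part back together, but it only works if the page actually provides that attribute, as the snippet in the question does.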