Beautiful Soup: get_text() but not the <span> text. How can I get it?
Given this markup:
[markup][1]
I need to get the number 182 in one column and 58 in another. I already have the span, but when I call div.get_text() or .string, it returns 18258 (both numbers).
This is my code:
prices = soup.find_all('div', class_='grilla-producto-precio')
cents = []
price = []
for px in prices:
    # here I need to get the number 182 and append it to "price"
    for spn in px.find('span'):
        cents.append(spn)
How can I get the price 182 on its own, without the span? Thanks!
[1]: https://i.stack.imgur.com/ld9qo.png
The answer to your question is almost the same as the answer to this question.
from bs4 import BeautifulSoup
html = """
<div class = "grilla-producto-precio">
" $"
"182"
<span>58</span>
</div>
"""
soup = BeautifulSoup(html, 'html5lib')  # requires the html5lib package
prices = soup.find_all('div', class_="grilla-producto-precio")
price = []
for px in prices:
    # first text node after the <div> tag, i.e. the text before the <span>
    txt = px.find_next(string=True).strip()
    txt = txt.replace('"', '')
    # the number sits on the last line of that text
    price.append(int(txt.split("\n")[-1]))
print(price)
Output:
[182]
Another solution is to keep only the characters for which isdigit() is true:
from bs4 import BeautifulSoup
txt = """
<div class = "grilla-producto-precio">
" $"
"182"
<span>58</span>
</div>
"""
soup = BeautifulSoup(txt, "html.parser")
data = soup.find("div", class_="grilla-producto-precio").next_element  # first text node inside the div
price = [int("".join(d for d in data if d.isdigit()))]
print(price) # Output: [182]
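Both answers above pick out the first text node of the div. A related option, sketched here rather than taken from either answer, is to collect only the div's direct text children with `find_all(string=True, recursive=False)` and read the span separately, which fills both columns in one pass. The markup below is an assumption: the original was posted only as an image, and the quotes in the answers are treated here as browser-inspector decoration.

```python
from bs4 import BeautifulSoup

# Assumed markup (the real page was only shown as a screenshot).
html = """
<div class="grilla-producto-precio">
 $182<span>58</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

price = []
cents = []
for px in soup.find_all("div", class_="grilla-producto-precio"):
    # recursive=False limits the search to direct children of the div,
    # so the text inside <span> is not included.
    own_text = "".join(px.find_all(string=True, recursive=False))
    # Keep only the digits, dropping "$", whitespace and any stray quotes.
    price.append(int("".join(ch for ch in own_text if ch.isdigit())))

    # The cents live in the nested <span>, if there is one.
    span = px.find("span")
    if span is not None:
        cents.append(int(span.get_text(strip=True)))

print(price)  # [182]
print(cents)  # [58]
```

Because the digit filter ignores everything else, the same loop should also cope with the quoted markup used in the answers. A div with no digits in its own text would raise a ValueError at `int("")`, so real pages may need a guard there.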