Beautiful Soup - get text from all <li> elements in <ul>
Using this code:
match_url = f'https://interativos.globoesporte.globo.com/cartola-fc/mais-escalados/mais-escalados-do-cartola-fc'
browser.visit(match_url)
browser.find_by_tag('li[class="historico-rodadas__rodada historico-rodadas__rodada--ativa"]').click()
soup = BeautifulSoup(browser.html, 'html.parser')
innerContent = soup.findAll('ul',class_="field__players")
print (innerContent)
I have managed to get the <ul>:
[<ul class="field__players"><li class="player"...]
How can I now access the text of player__name and player__value for every player in the list?
This should help you:
from selenium import webdriver
from bs4 import BeautifulSoup
driver = webdriver.Chrome()
driver.get('https://interativos.globoesporte.globo.com/cartola-fc/mais-escalados/mais-escalados-do-cartola-fc')
src = driver.page_source
driver.close()
soup = BeautifulSoup(src, 'html5lib')
innerContent = soup.find('ul', class_="field__players")
li_items = innerContent.find_all('li')
for li in li_items:
    # The [:-1] drops the last <p> in each <li>, which is player__label
    p_tags = li.find_all('p')[:-1]
    for p in p_tags:
        print(p.text)
Output:
Keno
2.868.755
Pedro
2.483.069
Bruno Henrique
1.686.894
Hugo Souza
809.186
Guilherme Arana
1.314.769
Filipe Luís
776.147
Thiago Galhardo
2.696.853
Vinícius
1.405.012
Nenê
1.369.209
Jorge Sampaoli
1.255.731
Réver
1.505.522
Víctor Cuesta
1.220.451
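If the values are needed as numbers rather than strings, the alternating name/value lines printed above can be paired up and converted. This is only a sketch: it assumes the dots in the values are thousands separators (as is usual in Brazilian number formatting), and `parse_count` and the `flat` list are hypothetical names used for illustration.

```python
def parse_count(text):
    """Convert a string such as '2.868.755' to an int.

    Assumption: the dots are thousands separators, not decimal points.
    """
    return int(text.strip().replace(".", ""))


# A few of the lines from the output above, in the order they were printed:
# name, value, name, value, ...
flat = ["Keno", "2.868.755", "Pedro", "2.483.069", "Hugo Souza", "809.186"]

# Walk the list two items at a time and pair each name with its numeric value.
pairs = [(flat[i], parse_count(flat[i + 1])) for i in range(0, len(flat) - 1, 2)]

for name, count in pairs:
    print(name, count)
```

Sorting `pairs` by the second element would then order the players by how often they were picked.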
I'll leave this here to show what the asker was after.
soup = BeautifulSoup(browser.html, 'html.parser')
# find() returns the <ul> tag itself; findAll() returns a ResultSet, which has no findAll() of its own
innerContent = soup.find('ul', class_="field__players")
for li in innerContent.find_all('li'):
    player_name = li.find('p', class_="player__name")
    player_value = li.find('p', class_="player__value")
    print(player_name.text)
    print(player_value.text)
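The class-based lookup can also be tried offline, without a browser, against a small static snippet. The sketch below is not the real page: the markup in `SAMPLE_HTML` is made up, and only the class names (field__players, player, player__name, player__value, player__label) come from this thread. `extract_players` is a hypothetical helper name. It skips any <li> that is missing one of the two <p> tags instead of failing on `None`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; only the class names are taken from the thread.
SAMPLE_HTML = """
<ul class="field__players">
  <li class="player">
    <p class="player__name">Keno</p>
    <p class="player__value">2.868.755</p>
    <p class="player__label">label</p>
  </li>
  <li class="player">
    <p class="player__name">Pedro</p>
    <p class="player__value">2.483.069</p>
    <p class="player__label">label</p>
  </li>
  <li class="player">
    <p class="player__label">no name or value here</p>
  </li>
</ul>
"""


def extract_players(html):
    """Return a list of (name, value) string pairs, one per complete <li>."""
    soup = BeautifulSoup(html, "html.parser")
    players = []
    # CSS selector: every <li> that is a direct child of <ul class="field__players">
    for li in soup.select("ul.field__players > li"):
        name = li.find("p", class_="player__name")
        value = li.find("p", class_="player__value")
        if name is None or value is None:
            # find() returns None when nothing matches, so skip incomplete items
            continue
        players.append((name.get_text(strip=True), value.get_text(strip=True)))
    return players


players = extract_players(SAMPLE_HTML)
for name, value in players:
    print(name, value)
```

With the live page, the same function could be given `browser.html` or `driver.page_source` in place of `SAMPLE_HTML`, assuming the rendered markup uses these class names.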