BeautifulSoup: extracting multiple tables
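I am trying to extract some data from two tables in the same HTML page with BeautifulSoup. I have managed to extract part of the data from both tables, but not all of it. Here is my code: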
from urllib.request import urlopen
from bs4 import BeautifulSoup

html_content = urlopen('https://www.icewarehouse.com/Bauer_Vapor_X25_Ice_Hockey_Skates/descpage-V25XS.html')
soup = BeautifulSoup(html_content, "lxml")
tables = soup.find_all('table', attrs={'class' : 'orderingtable fl'})

for table_skates in tables:
    t_headers = []
    t_data = []
    t_row = {}
    for tr in table_skates.find_all('th'):
        t_headers.append(tr.text.replace('\n', '').strip())
    for td in table_skates.find_all('td'):
        t_data.append(td.text.replace('\n', '').strip())
    t_row = dict(zip(t_headers, t_data))
    print(t_row)
This is the output I get:
{'Size': '1.0', 'Price': '9.99', 'Stock': '1', 'Qty': ''}
{'Size': '7.0', 'Price': '9.99', 'Stock': '2+', 'Qty': ''}
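A likely reason for seeing only one row per table: zip() stops as soon as the shorter input is exhausted, so pairing four headers with a flat list of every cell keeps only the first four cells. The sketch below illustrates this with made-up values (the second row is an assumption, not data from the page) and shows one way to split the flat list into rows.

```python
# Hypothetical values standing in for the scraped header and cell text.
headers = ['Size', 'Price', 'Stock', 'Qty']
cells = ['1.0', '9.99', '1', '',
         '1.5', '9.99', '2+', '']

# zip() pairs items only until the shorter input runs out,
# so every cell after the first four is ignored.
first_only = dict(zip(headers, cells))

# Splitting the flat cell list into header-sized chunks gives one dict per row.
width = len(headers)
rows = [dict(zip(headers, cells[i:i + width]))
        for i in range(0, len(cells), width)]

print(first_only)
print(rows)
```

This chunking only works if every row has exactly as many cells as there are headers.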
This is easy to do with read_html from pandas, which returns a list of DataFrames, one per matching table:
import pandas as pd
dfs = pd.read_html(html_content, attrs={'class' : 'orderingtable fl'})
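If you would rather stay with BeautifulSoup, one option is to walk each table row by row instead of collecting every cell into a single list. This is a minimal sketch, not a tested scraper for that site: SAMPLE_HTML is invented markup that only borrows the class name from the question, parse_tables is a hypothetical helper name, and it uses the built-in "html.parser" so the example does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real page (assumption);
# only the class name 'orderingtable fl' comes from the question.
SAMPLE_HTML = """
<table class="orderingtable fl">
  <tr><th>Size</th><th>Price</th><th>Stock</th><th>Qty</th></tr>
  <tr><td>1.0</td><td>9.99</td><td>1</td><td></td></tr>
  <tr><td>1.5</td><td>9.99</td><td>2+</td><td></td></tr>
</table>
<table class="orderingtable fl">
  <tr><th>Size</th><th>Price</th><th>Stock</th><th>Qty</th></tr>
  <tr><td>7.0</td><td>9.99</td><td>2+</td><td></td></tr>
</table>
"""


def parse_tables(html):
    """Return one dict per data row across all matching tables."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for table in soup.find_all('table', attrs={'class': 'orderingtable fl'}):
        headers = [th.get_text(strip=True) for th in table.find_all('th')]
        for tr in table.find_all('tr'):
            cells = [td.get_text(strip=True) for td in tr.find_all('td')]
            if cells:  # the header row has no <td> cells, so it is skipped
                rows.append(dict(zip(headers, cells)))
    return rows


rows = parse_tables(SAMPLE_HTML)
for row in rows:
    print(row)
```

On the real page you would pass the downloaded HTML to parse_tables instead of SAMPLE_HTML; whether the live markup has the same row structure is something to check.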