Scrape Wikipedia table using BeautifulSoup

Question

I want to scrape the table titled "List of chemical elements" from the Wikipedia link below and display it using pandas:

https://en.wikipedia.org/wiki/List_of_chemical_elements

I am new to BeautifulSoup, and this is what I have so far:

from bs4 import BeautifulSoup
import requests as r
import pandas as pd

# Download the page HTML
response = r.get('https://en.wikipedia.org/wiki/List_of_chemical_elements')
wiki_text = response.text

# Parse the HTML and collect every <table> element on the page
soup = BeautifulSoup(wiki_text, 'html.parser')
table_soup = soup.find_all('table')

Answer 1

You can select the table with BeautifulSoup in different ways:

By its "title" (text the table contains, such as its caption):

soup.select_one('table:-soup-contains("List of chemical elements")')

By its order in the tree (the first one):

soup.select_one('table')
soup.select('table')[0]

By its class (in your case there is no id):
```
soup.select_one('table.wikitable')
```
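
Once a table is selected, its rows can be turned into a DataFrame by hand. The following is only a minimal sketch: the `html` string is a made-up stand-in for the downloaded page (not the real Wikipedia markup), and it assumes a simple table with a single header row. The real table may use merged header cells, which would need extra handling.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the downloaded page: two tables, one captioned.
html = """
<table class="sidebar"><tr><td>navigation</td></tr></table>
<table class="wikitable">
  <caption>List of chemical elements</caption>
  <tr><th>Z</th><th>Symbol</th><th>Name</th></tr>
  <tr><td>1</td><td>H</td><td>Hydrogen</td></tr>
  <tr><td>2</td><td>He</td><td>Helium</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Any of the selectors above would do; here the table is picked by its class.
table = soup.select_one("table.wikitable")

# Collect the text of every header and data cell, row by row.
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in table.find_all("tr")
]

# The first row holds the column names, the remaining rows are data.
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df)
```

All values end up as strings this way, so numeric columns would still need converting, for example with `pd.to_numeric`.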

Or simply use `pandas`:

pd.read_html('https://en.wikipedia.org/wiki/List_of_chemical_elements')[0]
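
If the wanted table is not the first one on the page, `pd.read_html` also accepts a `match` argument that keeps only tables containing the given text. A small sketch, again using a made-up `html` string instead of the live page, and assuming an HTML parser that pandas supports (`lxml`, or `bs4` with `html5lib`) is installed:

```python
from io import StringIO
import pandas as pd

# Hypothetical stand-in for the page source, e.g. response.text.
html = """
<table class="sidebar"><tr><td>navigation</td></tr></table>
<table class="wikitable">
  <caption>List of chemical elements</caption>
  <tr><th>Z</th><th>Symbol</th><th>Name</th></tr>
  <tr><td>1</td><td>H</td><td>Hydrogen</td></tr>
  <tr><td>2</td><td>He</td><td>Helium</td></tr>
</table>
"""

# Wrap the string in StringIO: recent pandas versions deprecate
# passing literal HTML directly to read_html.
tables = pd.read_html(StringIO(html), match="List of chemical elements")

# read_html always returns a list of DataFrames, one per matching table.
df = tables[0]
print(df)
```

The same call works on HTML fetched with `requests`, which is useful when you need to control the request yourself.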

To get your expected result, try it yourself first; if you get stuck, ask a new question.
