How to get specific columns from a wiki table

Basically, I have a table on this page: https://en.wikipedia.org/wiki/List_of_cakes. I want to get the text from the first, third, and fourth columns and format it like this:

Amandine - Romania - Chocolate layered cake filled with chocolate, caramel and fondant cream

So far I have this code, which I adapted from the post "How do I extract text data in first column from Wikipedia table?":

import requests
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_cakes"

res = requests.get(url)
soup = BeautifulSoup(res.text,"lxml")
for items in soup.find(class_="wikitable").find_all("tr")[1:]:
    data = items.get_text(strip=True)
    print(data)

Output

AmandineRomaniaChocolate layered cake filled with chocolate, caramel and fondant cream
AmygdalopitaGreeceAlmond cake made with ground almonds, flour, butter, egg and pastry cream
Angel cakeUnited Kingdom[1]Sponge cake,cream,food colouring
Angel food cakeUnited StatesEgg whites, vanilla, andcream of tartar
etc...

I just want to scrape this wiki page and save these entries to a text file, so that when someone uses the command !cake on my Twitch channel, it picks one at random.

You're close to your goal. Just call find_all('td') on each row and select the cells you need by index from the ResultSet:

for items in soup.find(class_="wikitable").find_all("tr")[1:]:
    e = items.find_all('td')
    data = f'{e[0].text.strip()} - {e[2].text.strip()} - {e[3].text.strip()}'
    print(data)

Or use a list comprehension:

for items in soup.find(class_="wikitable").find_all("tr")[1:]:
    print(' - '.join([items.find_all('td')[i].get_text(strip=True) for i in [0,2,3]]))
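One thing to watch for: a row with fewer than four td cells would raise an IndexError with the indexing above. Here is a minimal sketch of a more defensive version. The helper name extract_rows and the SAMPLE_HTML snippet are made up for illustration (the sample is a tiny stand-in, not the real page), and it uses the built-in "html.parser" so it runs without lxml. It also loops over every table with the wikitable class, which is an assumption about what you want if the page has more than one.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the Wikipedia markup, so the sketch runs offline.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Name</th><th>Image</th><th>Origin</th><th>Description</th></tr>
  <tr><td>Amandine</td><td></td><td>Romania</td><td>Chocolate layered cake</td></tr>
  <tr><td colspan="2">Malformed row</td></tr>
  <tr><td>Apple cake</td><td></td><td>Germany</td><td>Apple, caramel icing</td></tr>
</table>
"""


def extract_rows(html, columns=(0, 2, 3)):
    """Return one ' - '-joined string per row that has all requested cells."""
    soup = BeautifulSoup(html, "html.parser")
    lines = []
    for table in soup.find_all("table", class_="wikitable"):
        for row in table.find_all("tr"):
            cells = row.find_all("td")
            # Header rows hold only <th> cells, so they are skipped here as well.
            if len(cells) <= max(columns):
                continue
            lines.append(" - ".join(cells[i].text.strip() for i in columns))
    return lines


if __name__ == "__main__":
    for line in extract_rows(SAMPLE_HTML):
        print(line)
```

For the live page you would pass res.text instead of SAMPLE_HTML.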

Example

import requests
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_cakes"

res = requests.get(url)
soup = BeautifulSoup(res.text,"lxml")
for items in soup.find(class_="wikitable").find_all("tr")[1:]:
    e = items.find_all('td')
    data = f'{e[0].text.strip()} - {e[2].text.strip()} - {e[3].text.strip()}'
    print(data)

Output

Amandine - Romania - Chocolate layered cake filled with chocolate, caramel and fondant cream
Amygdalopita - Greece - Almond cake made with ground almonds, flour, butter, egg and pastry cream
Angel cake - United Kingdom[1] - Sponge cake, cream, food colouring
Angel food cake - United States - Egg whites, vanilla, and cream of tartar
Apple cake - Germany - Apple, caramel icing
Applesauce cake - Early colonial times in the New England Colonies of the Northeastern United States[2] - Prepared using apple sauce, flour and sugar as primary ingredients
Aranygaluska - Hungary - A cake with yeasty dough and vanilla custard
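For the !cake part of the question, a rough sketch of the remaining steps could look like this: strip citation markers such as [1] (visible in the output above), write one entry per line to a text file, and pick a random line later. The function names and the cakes.txt file name are hypothetical, and the demo writes to a temporary directory only so that it leaves nothing behind. How the chosen line reaches Twitch chat depends on your bot and is not covered here.

```python
import random
import re
import tempfile
from pathlib import Path


def strip_footnotes(text):
    """Remove bracketed numeric markers such as '[1]' left by citations."""
    return re.sub(r"\[\d+\]", "", text).strip()


def save_lines(lines, path):
    """Write one cleaned entry per line to a UTF-8 text file."""
    cleaned = [strip_footnotes(line) for line in lines]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")


def random_cake(path):
    """Return one non-empty line chosen at random from the saved file."""
    text = Path(path).read_text(encoding="utf-8")
    entries = [line for line in text.splitlines() if line.strip()]
    return random.choice(entries)


if __name__ == "__main__":
    # Two lines copied from the output above, used as sample data.
    sample = [
        "Angel cake - United Kingdom[1] - Sponge cake, cream, food colouring",
        "Apple cake - Germany - Apple, caramel icing",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "cakes.txt"  # hypothetical file name
        save_lines(sample, target)
        print(random_cake(target))
```

In practice you would collect the data strings from the loop into a list, call save_lines once, and have the bot call random_cake whenever the command is used.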