Extracting a scraped list into new columns

I have this code (borrowed from an old question posted on this site):

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

# Open the page in Chrome and parse the HTML it returns
driver = webdriver.Chrome()
driver.get("https://www.baseball-reference.com/leagues/MLB/2013-finalyear.shtml")
doc = BeautifulSoup(driver.page_source, "html.parser")
driver.quit()  # the browser is no longer needed once the HTML is parsed

# The table has an id, which makes it simpler to target
batting = doc.find(id='misc_batting')

careers = []
for row in batting.find_all('tr')[1:]:
    dictionary = {}
    dictionary['names'] = row.find(attrs = {"data-stat": "player"}).text.strip()
    dictionary['experience'] = row.find(attrs={"data-stat": "experience"}).text.strip()
    careers.append(dictionary)

It produces the following result:

[{'names': 'David Adams', 'experience': '1'}, {'names': 'Steve Ames', 'experience': '1'}, {'names': 'Rick Ankiel', 'experience': '11'}, {'names': 'Jairo Asencio', 'experience': '4'}, {'names': 'Luis Ayala', 'experience': '9'}, {'names': 'Brandon Bantz', 'experience': '1'}, {'names': 'Scott Barnes', 'experience': '2'}, {'names':

How can I turn this into a DataFrame with separate columns, like this?

Names       Experience
David Adams   1

Just pass your list of dicts (careers) to pandas.DataFrame() to get the expected result.

Example

import pandas as pd

careers = [{'names': 'David Adams', 'experience': '1'}, {'names': 'Steve Ames', 'experience': '1'}, {'names': 'Rick Ankiel', 'experience': '11'}, {'names': 'Jairo Asencio', 'experience': '4'}, {'names': 'Luis Ayala', 'experience': '9'}, {'names': 'Brandon Bantz', 'experience': '1'}, {'names': 'Scott Barnes', 'experience': '2'}]

pd.DataFrame(careers)

Output

names          experience
David Adams    1
Steve Ames     1
Rick Ankiel    11
Jairo Asencio  4
Luis Ayala     9
Brandon Bantz  1
Scott Barnes   2
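
Note that the scraped values are strings, and the column names differ from the capitalized ones shown in the question. A minimal sketch of tidying both, assuming you want Names and Experience as headers and whole-number experience values (the three rows are a shortened copy of the list above):

```python
import pandas as pd

# Shortened copy of the scraped list; every value is still a string
careers = [
    {'names': 'David Adams', 'experience': '1'},
    {'names': 'Steve Ames', 'experience': '1'},
    {'names': 'Rick Ankiel', 'experience': '11'},
]

df = pd.DataFrame(careers)

# Rename the columns to match the headers asked for in the question
df = df.rename(columns={'names': 'Names', 'experience': 'Experience'})

# Convert the experience strings to numbers so they sort and sum correctly
df['Experience'] = pd.to_numeric(df['Experience'])

print(df)
```

Without the conversion, sorting by experience would compare text, so '11' would come before '2'.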

You can simplify this a lot with pandas. Let it pull the table, and then you only need the Name and Yrs columns.

import pandas as pd

url = "https://www.baseball-reference.com/leagues/MLB/2013-finalyear.shtml"
df = pd.read_html(url, attrs = {'id': 'misc_batting'})[0]

df_filter = df[['Name','Yrs']]

If you need to rename the columns, add:

df_filter = df_filter.rename(columns={'Name':'names','Yrs':'experience'})

Output:

print(df_filter)
              names  experience
0       David Adams           1
1        Steve Ames           1
2       Rick Ankiel          11
3     Jairo Asencio           4
4        Luis Ayala           9
..              ...         ...
209    Dewayne Wise          11
210       Ross Wolf           3
211  Kevin Youkilis          10
212   Michael Young          14
213          Totals        1357

[214 rows x 2 columns]
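
The last row in that output is a Totals summary line rather than a player. If you only want players, one way to drop it is sketched below. It assumes the summary row is always labeled exactly 'Totals' in the names column, and it uses a three-row stand-in for df_filter so that it runs without fetching the page:

```python
import pandas as pd

# Stand-in for df_filter, built from three rows of the output above
df_filter = pd.DataFrame({
    'names': ['David Adams', 'Michael Young', 'Totals'],
    'experience': [1, 14, 1357],
})

# Keep every row whose name is not the summary label, then renumber the index
players = df_filter[df_filter['names'] != 'Totals'].reset_index(drop=True)

print(players)
```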