Web scraping into a dataframe. Getting error when adding a column

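My error: "list indices must be integers or slices, not list"

I know there seem to be endless posts about this error, but I have searched and still can't figure it out. If there is an existing solution I missed that would help, please let me know.

Anyway...

I am using pandas to scrape stock information from the web into a dataframe, and then adding two calculated columns at the end:

  1. A calculated price
  2. Another, unrelated calculation

The problem occurs when I try to add the first calculated column (the last bit of the code):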

# Dependencies
from bs4 import BeautifulSoup
import requests
import pandas as pd

url = 'http://www.dividend.com/dividend-stocks/preferred-dividend-stocks.php#stocks&sort_name=Symbol&sort_order=ASC&page=1'
tables = pd.read_html(url)
tables

type(tables)
type(tables[0])
tables[0].head()

tables['Perp Value'] = (1/(tables['Dividend Yield']/100))*tables['Annual Dividend']
tables[0].head()

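The error occurs when I try to add the 'Perp Value' column. What do I need to add to make the calculation work?

For reference, the unformatted data looks like this: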

\nDividend Yield\n \nCurrent Price\n \nAnnual Dividend\n \n52-Week High\n \
0 8.04% .65 .06 25.98
1 7.61% .47 .94 25.95
2 6.66% .82 .72 28.80
3 7.47% .95 .94 26.87
4 5.78% .72 .49 28.99
5 8.06% .20 .11 26.00
6 7.72% .87 .84 0.00
7 7.75% .80 .84 0.00
8 7.80% .05 .88 0.00
9

Try this:

Find the column names:
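For context, `pd.read_html` returns a list of DataFrames, one per table found on the page, so `tables['Dividend Yield']` indexes a plain Python list rather than a DataFrame and raises a `TypeError`. A minimal sketch reproducing this without any network access; the one-row frame and its values are made up for illustration and only stand in for the scraped result:

```python
import pandas as pd

# Hypothetical stand-in for the result of pd.read_html(url): a list of DataFrames.
tables = [pd.DataFrame({"Dividend Yield": ["8.04%"], "Annual Dividend": ["$2.06"]})]

try:
    tables["Dividend Yield"]  # indexes the list itself, not a DataFrame
    error_message = None
except TypeError as exc:
    error_message = str(exc)

df = tables[0]                 # pick the first DataFrame out of the list
column = df["Dividend Yield"]  # column lookup now works
```

Selecting `tables[0]` first, and only then indexing by column name, is what the rest of this answer relies on.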

list(tables[0])

['\nStock Symbol\n',
'\nCompany Name\n',
'\nDividend Yield\n',
'\nCurrent Price\n',
'\nAnnual Dividend\n',
'\n52-Week High\n',
'\n52-Week Low\n']
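The column labels carry leading and trailing newlines, which is why a lookup such as `'Dividend Yield'` would not match them directly. One possible way to make the names easier to use is to strip that whitespace first. This is a sketch on a made-up frame with the same kind of padded headers, not on the scraped data:

```python
import pandas as pd

# Hypothetical frame whose headers are padded with newlines, like the scraped table.
df = pd.DataFrame({"\nDividend Yield\n": ["8.04%"], "\nAnnual Dividend\n": ["$2.06"]})

# Strip surrounding whitespace (including newlines) from every column label.
df.columns = df.columns.str.strip()
column_names = list(df.columns)
```

After this, `df['Dividend Yield']` can be used instead of positional lookups like `list(tables[0])[2]`.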

Clean the data and convert to numeric:

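# Column 4 is '\nAnnual Dividend\n': strip '$' and ',' and convert to float
a = tables[0][list(tables[0])[4]]
a = a.replace('[$,]', '', regex=True).astype(float)
# Column 2 is '\nDividend Yield\n': strip '%' and ',' and convert to float
b = tables[0][list(tables[0])[2]]
b = b.replace('[%,]', '', regex=True).astype(float)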
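The same cleaning step could be wrapped in a small helper if several columns need it. The sketch below is an assumption, not part of the original answer: `to_number` is a hypothetical name, the sample strings are invented, and it uses `pd.to_numeric(..., errors="coerce")` so that unparseable cells become `NaN` instead of raising.

```python
import pandas as pd

def to_number(series: pd.Series) -> pd.Series:
    """Remove '$', '%' and ',' characters, then convert to float (hypothetical helper)."""
    cleaned = series.astype(str).str.replace(r"[$%,]", "", regex=True)
    # Values that still cannot be parsed become NaN rather than raising an error.
    return pd.to_numeric(cleaned, errors="coerce")

# Invented sample values covering a currency string, a percentage, and a bad cell.
sample = pd.Series(["$1,234.50", "8.04%", "n/a"])
numbers = to_number(sample)
```

Whether silently coercing bad cells to `NaN` is desirable depends on the data; `.astype(float)`, as used above, fails loudly instead.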

Output:

(1/a/100)*b

0     0.039029
1     0.039227
2     0.038721
3     0.038505
4     0.038792
5     0.038199
6     0.041957
7     0.042120
8     0.041489
9     0.041649
10    0.038594
11    0.039053
12    0.038795
13    0.037853
14    0.037546
15    0.039100
16    0.039320
17    0.039898
18    0.038431
19    0.040290
dtype: float64

Assign:

tables[0]["Perp Value"] = (1/a/100)*b
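Note that `(1/a/100)*b` evaluates to `b / (100 * a)`, that is, the yield divided by the dividend, whereas the formula in the question divides the annual dividend by the yield expressed as a fraction. If the question's formula is the one intended, a sketch could look like the following. It assumes the columns are already cleaned and numeric, and the two sample rows are made up rather than taken from the scraped table:

```python
import pandas as pd

# Hypothetical, already-cleaned numeric data (not the scraped values).
df = pd.DataFrame({"Dividend Yield": [8.0, 5.0], "Annual Dividend": [2.0, 1.5]})

# Formula from the question: annual dividend divided by the yield as a fraction.
df["Perp Value"] = df["Annual Dividend"] / (df["Dividend Yield"] / 100)
perp_values = df["Perp Value"].round(2).tolist()
```

With the variables from this answer, the equivalent expression would be `a / (b / 100)`.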