Importing Historical S&P 500 Data in Python
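I am trying to pull historical stock data, such as open price and volume, for every stock in the S&P 500 and then print it. However, my code is flawed. When I run it, I get "AttributeError Traceback (most recent call last)" followed by "AttributeError: 'DataFrame' object has no attribute 'split'". Where am I going wrong?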
import pandas as pd
table=pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = table[0]
df.to_csv('S&P500-Info.csv')
df.to_csv("S&P500-Symbols.csv", columns=['Symbol'])
col_list = ["Symbol"]
df = pd.read_csv("S&P500-Symbols.csv", usecols=col_list)
stockdata = (df)
!pip install yfinance
import yfinance as yf
full_stock_data = yf.download(stockdata,'2010-01-01','2021-03-03')
print(full_stock_data)
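You have a list of tickers in a single column. You are now asking yfinance to pull several columns for each ticker over many days.
Enclosed is the output for 1 ticker. Note that the symbol is not included in the data.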
[*********************100%***********************]  1 of 1 completed
                  Open        High         Low       Close   Adj Close
Date
2010-01-04   26.000362   26.177889   25.870815   26.129908   20.074522
2010-01-05   26.134706   26.134706   25.789249   25.918797   19.912334
2010-01-06   25.880411   26.096321   25.837231   26.062737   20.022911
2010-01-07   26.057938   26.283443   25.942785   26.278646   20.188793
2010-01-08   26.273848   26.508949   26.235464   26.412991   20.292002
...                ...         ...         ...         ...         ...
2021-02-24  120.800003  122.910004  120.660004  122.379997  122.379997
2021-02-25  121.660004  122.760002  120.769997  121.580002  121.580002
2021-02-26  122.190002  122.360001  119.660004  119.779999  119.779999
2021-03-01  120.989998  122.949997  120.610001  122.209999  122.209999
2021-03-02  122.209999  123.099998  121.190002  122.529999  122.529999

              Volume
Date
2010-01-04  10829095
2010-01-05  10562109
2010-01-06  11401417
2010-01-07  12857232
2010-01-08  12148604
...              ...
2021-02-24   4127400
2021-02-25   3468900
2021-02-26   5197500
2021-03-01   3858600
2021-03-02   4908100

[2809 rows x 6 columns]
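The single-ticker output above carries no symbol. With several tickers, yfinance normally returns two-level columns of (field, symbol). The sketch below shows how such a frame can be sliced and reshaped. It uses a small synthetic frame instead of a real download, so the symbols `AAA` and `BBB` and all numbers are made up, and the exact column layout of a real download is an assumption that may vary by yfinance version.

```python
import pandas as pd

# Synthetic stand-in for a multi-ticker download: two-level columns (field, symbol).
# "AAA" and "BBB" are placeholder symbols, and the values are invented.
dates = pd.to_datetime(["2021-03-01", "2021-03-02"])
columns = pd.MultiIndex.from_product([["Open", "Volume"], ["AAA", "BBB"]])
wide = pd.DataFrame(
    [[10.0, 20.0, 100, 200],
     [11.0, 21.0, 110, 210]],
    index=dates,
    columns=columns,
)

# One field for every symbol: the columns are the symbols.
volume = wide["Volume"]

# Every field for one symbol: the columns are the fields.
aaa = wide.xs("AAA", axis=1, level=1)

# Long format with an explicit Symbol column, one row per (date, symbol).
long = wide.stack(level=1).rename_axis(["Date", "Symbol"]).reset_index()

print(volume)
print(aaa)
print(long)
```

The long format is convenient when the symbol has to travel with each row, for example when writing the data to a single CSV file.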
What you can do is something like this:
import pandas_datareader as web
import pandas as pd
# Get your symbols using your own code; here they are in a list called `stocks`
dfAdjClose = web.DataReader(stocks, "yahoo", start="2010-01-01", end="2021-01-31")["Adj Close"]
# Repeat for the other fields, then combine the results by symbol
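One possible way to do the "combine by symbol" step is `pd.concat` with a dict of per-field frames. This is a sketch rather than the answerer's own code: the two input frames are synthetic stand-ins for what each per-field request would return, and the symbols `AAA` and `BBB` are placeholders.

```python
import pandas as pd

dates = pd.to_datetime(["2021-01-28", "2021-01-29"])

# Synthetic per-field frames, one column per symbol (placeholder data).
adj_close = pd.DataFrame({"AAA": [10.0, 11.0], "BBB": [20.0, 21.0]}, index=dates)
volume = pd.DataFrame({"AAA": [100, 110], "BBB": [200, 210]}, index=dates)

# Stack the frames side by side; the dict keys become the outer column level.
combined = pd.concat({"Adj Close": adj_close, "Volume": volume}, axis=1)

# Put the symbol on the outer level so each symbol's fields sit together.
by_symbol = combined.swaplevel(axis=1).sort_index(axis=1)

print(by_symbol["AAA"])
```

After the swap, `by_symbol["AAA"]` returns all fields collected for that one symbol.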
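yf.download needs a Python list of ticker strings, not a pandas object such as a pd.Series or a DataFrame. Anything that is not a list appears to be treated as a single string of symbols, which is most likely where the 'split' error comes from. I think your CSV construct could be made to work, but it isn't needed at all. This should work: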
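import pandas as pd
import yfinance as yf

# The first table on the Wikipedia page holds the S&P 500 constituents
table = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = table[0]
# Convert the Symbol column to a plain Python list of strings
stockdata = df['Symbol'].to_list()
full_stock_data = yf.download(stockdata, '2010-01-01', '2021-03-03')
print(full_stock_data['Volume'])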
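One detail worth checking before passing the list on: Wikipedia writes share classes with a dot (for example BRK.B), while Yahoo-style tickers generally use a hyphen (BRK-B). The sketch below shows one way to build a clean list. The helper name `to_ticker_list` is hypothetical, and the sample frame stands in for the Wikipedia table so the snippet runs without network access.

```python
import pandas as pd


def to_ticker_list(df, column="Symbol"):
    """Return the symbols in `column` as a plain list of strings (hypothetical helper)."""
    symbols = df[column].astype(str).str.strip()
    # Assumption: the data source expects '-' where Wikipedia uses '.', e.g. BRK.B -> BRK-B
    return symbols.str.replace(".", "-", regex=False).to_list()


# Small stand-in for the Wikipedia constituents table.
sample = pd.DataFrame({"Symbol": ["MMM", "BRK.B", " AAPL "]})
tickers = to_ticker_list(sample)
print(tickers)
```

The resulting list can then be passed as the first argument to yf.download in place of `stockdata`.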