Question about .apply() and passing entire columns through functions
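I have a dataframe with tickers as the column headers and the past year of daily adjusted close prices in the rows. I want to calculate annualized volatility, but I'm not sure how to pass the columns through the function. When I run this code I get an exception: TypeError: 'Series' object is not callable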
import pandas as pd
import datetime as datetime
from datetime import timedelta
import yfinance as yf

df = pd.read_excel('C:/Users/Jacob/Downloads/Benchmark Tickers.xlsx', sheet_name='Sheet1')
tickers_list = df['Ticker'].tolist()
data = pd.DataFrame(columns=tickers_list)

for ticker in tickers_list:
    data[ticker] = yf.download(ticker, start=datetime.datetime.now()-datetime.timedelta(days=365), end=datetime.date.today())["Adj Close"]

def volatility(ticker):
    return data[ticker].pct_change().rolling(252).std()*(252**0.5)

data[ticker].apply(volatility(ticker))

export_excel = data.to_excel(r'C:/Users/User/Downloads/testvol.xlsx', sheet_name='Sheet1', index=True)
How can I apply this volatility function to every column?
Here is a link to the data that gets pulled when you run the yfinance download:
https://docs.google.com/spreadsheets/d/11-kS1ah1lP8v6xv2JQZt_0i7YmynQxIw6q0stEvS_nM/edit?usp=sharing
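When you call apply on a Series, you pass the function itself rather than the result of calling it; any extra arguments are passed to apply as keyword arguments instead of being written into a function call. In the question, volatility(ticker) runs first and returns a Series, and apply then tries to call that Series, which raises the TypeError. Note also that Series.apply calls the function once per element, with that element as the first argument, so data[ticker].apply(volatility, ticker=ticker) would not work with volatility as written. Since the function already looks up the column and returns the whole result, call it directly:
data[ticker] = volatility(ticker)
That should resolve the error.
The pandas documentation also has some good examples.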
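A minimal sketch of what apply expects, using a small made-up dataframe in place of the downloaded prices. The ticker ABC, its prices, the 2-row window, and the helper scale are assumptions for illustration only; they are not part of the original question.

```python
import pandas as pd

# Hypothetical prices standing in for one downloaded column.
data = pd.DataFrame({"ABC": [100.0, 102.0, 101.0, 105.0]})

def volatility(ticker):
    # Same shape as the question's function, with a 2-row window
    # so that the toy data produces some values.
    return data[ticker].pct_change().rolling(2).std()

# apply needs something it can call. The function object qualifies...
function_is_callable = callable(volatility)
# ...but volatility("ABC") has already run and returned a Series.
result_is_callable = callable(volatility("ABC"))

def scale(value, factor=1.0):
    """Multiply a single element by factor."""
    return value * factor

# Series.apply calls scale once per element; extra arguments
# travel as keyword arguments to apply itself.
scaled = data["ABC"].apply(scale, factor=2.0)

print(function_is_callable, result_is_callable)
print(scaled.tolist())
```

Because Series.apply works element by element, it suits functions like scale. A rolling calculation needs the whole column at once, which is why calling the function directly, or using apply on the dataframe, fits the volatility case better.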
- When .apply is used on a dataframe, the default axis=0 performs the calculation column-wise, once for each column, so there is no need to specify each ticker name.
import pandas as pd
import yfinance as yf
from datetime import datetime, timedelta, date

# given a list of tickers
tickers = ['EFT', 'PPR', 'SRLN']

# create an empty dataframe with one column per ticker
data = pd.DataFrame(columns=tickers)

# get the data
for ticker in tickers:
    data[ticker] = yf.download(ticker, start=datetime.now()-timedelta(days=365), end=date.today())["Adj Close"]

# perform the calculation for each ticker and store the result in the calcs dataframe
# 252 is too many for rolling: it needs 252 rows before it can produce a single value,
# so 10 is used here to show that it's working
calcs = data.apply(lambda x: x.pct_change().rolling(10).std()*(10**0.5))

# display(calcs.head(20))
                 EFT       PPR      SRLN
Date
2019-08-20       NaN       NaN       NaN
2019-08-21       NaN       NaN       NaN
2019-08-22       NaN       NaN       NaN
2019-08-23       NaN       NaN       NaN
2019-08-26       NaN       NaN       NaN
2019-08-27       NaN       NaN       NaN
2019-08-28       NaN       NaN       NaN
2019-08-29       NaN       NaN       NaN
2019-08-30       NaN       NaN       NaN
2019-09-03       NaN       NaN       NaN
2019-09-04  0.009594  0.012125  0.004690
2019-09-05  0.009483  0.012122  0.004691
2019-09-06  0.009870  0.009697  0.004736
2019-09-09  0.009037  0.010020  0.004191
2019-09-10  0.009205  0.009544  0.003981
2019-09-11  0.006672  0.009543  0.004084
2019-09-12  0.006492  0.010054  0.003925
2019-09-13  0.005592  0.010049  0.003992
2019-09-16  0.005428  0.012274  0.003367
2019-09-17  0.004926  0.010776  0.002505
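For an annualized figure, a common convention is to scale the standard deviation of daily returns by the square root of the number of trading days per year. The sketch below assumes 252 trading days and uses a seeded random walk in place of the yfinance download, so the prices, the function name annualized_volatility, and the constant TRADING_DAYS are illustrative assumptions rather than part of the answer above.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed number of trading days per year

# Hypothetical prices: a seeded random walk stands in for downloaded data.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2019-08-20", periods=60)
daily_moves = rng.normal(0.0, 0.01, size=(60, 3))
prices = pd.DataFrame(
    100 * np.exp(daily_moves.cumsum(axis=0)),
    index=dates,
    columns=["EFT", "PPR", "SRLN"],
)

def annualized_volatility(column, window=10):
    """Rolling std of daily returns, scaled by sqrt(TRADING_DAYS)."""
    return column.pct_change().rolling(window).std() * np.sqrt(TRADING_DAYS)

# axis=0 (the default) hands each column to the function as a Series;
# window=10 is forwarded to the function as a keyword argument.
rolling_vol = prices.apply(annualized_volatility, window=10)

# One number per ticker over the whole sample, with no rolling window.
full_sample_vol = prices.pct_change().std() * np.sqrt(TRADING_DAYS)

print(rolling_vol.tail())
print(full_sample_vol)
```

With a 10-row window the first 10 rows are NaN, since pct_change leaves the first row empty and rolling then needs 10 returns before it yields a value. The full-sample version avoids that by producing a single value per column.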