How to compute a financial ratio over specific date ranges for a dataframe of multiple stocks?
I am trying to write code that computes the Sharpe ratio for a list of stocks for each calendar quarter. Here is a sample of my dataframe (df):
DataFrame
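So, first, I wrote a function that computes the Sharpe ratio:
import numpy as np
import pandas as pd
def compute_sr(df, risk_free_rate=0):
    mean_return = df.mean()
    std = df.std()
    sharpe_ratio = (mean_return - risk_free_rate)/std
    # scale the daily Sharpe ratio to one quarter, assuming about 62 trading days
    return sharpe_ratio * np.sqrt(62)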
Then I wrote code that computes the Sharpe ratio for each stock in the list by calendar quarter and adds the result to a new column:
start = '2018-01-01'
end = '2021-12-21'
list_of_stocks = ['AAPL', 'MSFT', 'FB']
date_range = pd.date_range(start=start, end=end)
for stock in list_of_stocks:
    df['Sharpe Ratio'] = df[df['Symbol'] == stock]['Log Return'][compute_sr(df.loc[date_range - pd.offsets.DateOffset(months=3): date_range, 'Log Return'])
                                                                 for date_range in df.index]
Unfortunately, I get the following syntax error:
File "<ipython-input-211-a78d09294412>", line 3
for date_range in trading_data.index]
^
SyntaxError: invalid syntax
What is causing the syntax error?
Thanks in advance.
I think you have three problems here: what you think you are doing with the dates, your compute_sr function, and the mess inside your stocks loop.
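First, the dates:
When you run date_range = pd.date_range(start=start, end=end), you get an array of dates (a DatetimeIndex), not a single date. So you cannot use it as a slice bound in .loc to select the data and get the dataframe you want to pass to compute_sr.
You have: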
df.loc[an_array: another_array, 'Log Return']
Pandas cannot figure out where the index of the dataframe to be returned starts and where it ends. You need:
df.loc[a_start_date:an_end_date, 'Log Return']
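As a minimal sketch of the scalar-bounds form above, assuming the dataframe is indexed by a sorted DatetimeIndex: the sample values and the names returns, start_date and end_date are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample: five trading days of log returns, indexed by date.
returns = pd.DataFrame(
    {"Log Return": [-0.000175, 0.004634, 0.011321, -0.003721, -0.000115]},
    index=pd.to_datetime(
        ["2018-01-03", "2018-01-04", "2018-01-05", "2018-01-08", "2018-01-09"]
    ),
)

# Each slice bound is a single timestamp, not an array of dates.
end_date = pd.Timestamp("2018-01-08")
start_date = end_date - pd.DateOffset(days=4)  # 2018-01-04

# Label-based slicing with .loc includes both endpoints.
window = returns.loc[start_date:end_date, "Log Return"]
print(window)
```

Here window holds the rows for 2018-01-04, 2018-01-05 and 2018-01-08, which is the kind of object compute_sr expects.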
But let's assume you do get a dataframe to pass to compute_sr. Here is what the function does with it:
df = pd.DataFrame({
    'Date': ['2018-01-03','2018-01-04','2018-01-05','2018-01-08','2018-01-09','2018-01-03','2018-01-04','2018-01-05','2018-01-08','2018-01-09'],
    'Symbol': ['AAPL','AAPL','AAPL','AAPL','AAPL','GOOGL','GOOGL','GOOGL','GOOGL','GOOGL'],
    'AdjClose': [41.12,41.31,41.79,41.63,41.63,2265.89,2465.89,2625.89,2565.89,2165.89],
    'LogReturn': [-0.000175,0.004634,0.011321,-0.003721,-0.000115,-0.003452,0.004111,0.032111,0.003721,-0.000115]})
stock = 'AAPL'  # the value printed below corresponds to AAPL
# calculate_sr
mean_return = df[df['Symbol'] == stock]['LogReturn'].mean()
std = df[df['Symbol'] == stock]['LogReturn'].std()
sharpe_ratio = (mean_return - 0)/std
print(sharpe_ratio)
0.4111951194083935
So that messy line effectively boils down to:
df1 = df[df['Symbol'] == stock]['Log Return']
# compute_sr will resolve to `sharpe_ratio`
sharpe_ratio = 0.4111951194083935
df1[sharpe_ratio]
This raises an error, because 0.41119... is neither a label in the index nor a column name.
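To make the failure concrete, here is a small sketch that indexes a Series with a float that is not one of its labels. The Series contents are hypothetical, and the exact exception type may depend on the index type and the pandas version, so the sketch catches several candidates rather than asserting one.

```python
import pandas as pd

# Hypothetical stand-in for df1: log returns indexed by date.
log_returns = pd.Series(
    [-0.000175, 0.004634, 0.011321],
    index=pd.to_datetime(["2018-01-03", "2018-01-04", "2018-01-05"]),
)

sharpe_ratio = 0.4111951194083935

try:
    # A float is treated as a label to look up, and no such label exists.
    log_returns[sharpe_ratio]
    lookup_failed = False
except (KeyError, TypeError, IndexError) as exc:
    lookup_failed = True
    print(type(exc).__name__, exc)
```

The point is that the number returned by compute_sr is a value to store, not a key to look up.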
You could do the following instead:
import numpy as np
df['Date'] = pd.to_datetime(df['Date'])
# Get the year and quarter of each 'Date' value
df['Quarter'] = df['Date'].dt.quarter
df['Year'] = df['Date'].dt.year
# Keep only the data between 2018 and 2021
df = df[(df['Year'] > 2017) & (df['Year'] < 2022)]
quarterly = df.groupby(['Symbol', 'Year', 'Quarter'])
# Sharpe ratio of one group of returns
def compute_sr(x, risk_free_rate=0):
    mean_return = x.mean()
    std = x.std()
    sharpe_ratio = (mean_return - risk_free_rate)/std
    # scale the daily Sharpe ratio to one quarter, assuming about 62 trading days
    return sharpe_ratio * np.sqrt(62)
sr = quarterly['LogReturn'].apply(compute_sr)
print(sr)
Symbol  Year  Quarter
AAPL    2018  1          3.237754
GOOGL   2018  1          4.027694
Name: LogReturn, dtype: float64
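The question asked for the result in a new column rather than in a separate Series. One possible way to get that, sketched here as an assumption rather than as part of the answer above, is groupby(...).transform, which broadcasts each group's scalar back onto that group's rows. The sample data is the one from the answer; the column name 'Sharpe Ratio' is taken from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2018-01-03", "2018-01-04", "2018-01-05", "2018-01-08", "2018-01-09"] * 2
    ),
    "Symbol": ["AAPL"] * 5 + ["GOOGL"] * 5,
    "LogReturn": [-0.000175, 0.004634, 0.011321, -0.003721, -0.000115,
                  -0.003452, 0.004111, 0.032111, 0.003721, -0.000115],
})
df["Year"] = df["Date"].dt.year
df["Quarter"] = df["Date"].dt.quarter


def compute_sr(x, risk_free_rate=0):
    # Same formula as in the thread: daily Sharpe ratio scaled by sqrt(62).
    return (x.mean() - risk_free_rate) / x.std() * np.sqrt(62)


# transform repeats each group's single value on every row of that group,
# so the result lines up with the original rows.
df["Sharpe Ratio"] = (
    df.groupby(["Symbol", "Year", "Quarter"])["LogReturn"].transform(compute_sr)
)
print(df[["Date", "Symbol", "Sharpe Ratio"]])
```

Every AAPL row in 2018 Q1 then carries the same quarterly value, and likewise for GOOGL, matching the two numbers printed above.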