How to compute a financial ratio over specific date ranges for a dataframe of multiple stocks?

I am trying to write code that computes the Sharpe ratio for each calendar quarter for a list of stocks. Here is a sample of my dataframe (df):

DataFrame

So, first, I wrote a function that computes the Sharpe ratio:

import numpy as np

def compute_sr(df, risk_free_rate=0):
    mean_return = df.mean()
    std = df.std()
    sharpe_ratio = (mean_return - risk_free_rate) / std

    # quarterly annualized Sharpe Ratio
    return sharpe_ratio * np.sqrt(62)

Then I wrote code that computes the Sharpe ratio per calendar quarter for each stock in the list and adds the result to a new column:

start = '2018-01-01'
end = '2021-12-21'
list_of_stocks = ['AAPL', 'MSFT', 'FB']
date_range = pd.date_range(start=start, end=end)

for stock in list_of_stocks:
    df['Sharpe Ratio'] = df[df['Symbol'] == stock]['Log Return'][compute_sr(df.loc[date_range - pd.offsets.DateOffset(months=3): date_range, 'Log Return'])
        for date_range in df.index]

Unfortunately, I got the following syntax error:

File "<ipython-input-211-a78d09294412>", line 3
for date_range in trading_data.index]
^
SyntaxError: invalid syntax

What is the cause of the syntax error?

Thanks in advance.

I think you have three problems here: what you think you are doing with the dates, your compute_sr function, and that mess inside your loop over the stocks.

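First, the dates:

When you do date_range = pd.date_range(start=start, end=end), you get a DatetimeIndex, that is, a whole array of dates rather than a single date. So you cannot use it as a .loc slice endpoint to select the data and get the dataframe you want to pass as an argument to compute_sr.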

You have:

df.loc[an_array: another_array, 'Log Return']

Pandas cannot figure out where the slice of the dataframe to be returned should start and where it should end. You need:

df.loc[a_start_date:an_end_date, 'Log Return']
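To make the difference concrete, here is a small sketch on made-up data (the values and the names `idx`, `returns` and `window` are illustrative, not taken from the question). It shows that `pd.date_range` returns a whole `DatetimeIndex`, and that a `.loc` slice works once it is given one start label and one end label.

```python
import pandas as pd

# Illustrative data: ten consecutive calendar days of made-up log returns.
idx = pd.date_range(start='2018-01-01', periods=10, freq='D')
returns = pd.Series([0.001 * i for i in range(10)], index=idx, name='Log Return')

# pd.date_range gives a DatetimeIndex (many dates), not a single date.
print(type(idx).__name__)

# Slicing with two scalar labels works; on a label slice both endpoints are included.
end_date = pd.Timestamp('2018-01-08')
start_date = end_date - pd.DateOffset(days=3)
window = returns.loc[start_date:end_date]
print(window)
```

The same pattern would apply with `pd.DateOffset(months=3)` for a trailing three-month window, provided the dataframe is indexed by date.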

But let's assume you do get a dataframe that you pass to compute_sr. Here is what it does:

import pandas as pd

df = pd.DataFrame({
    'Date': ['2018-01-03', '2018-01-04', '2018-01-05', '2018-01-08', '2018-01-09', '2018-01-03', '2018-01-04', '2018-01-05', '2018-01-08', '2018-01-09'],
    'Symbol': ['AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'GOOGL', 'GOOGL', 'GOOGL', 'GOOGL', 'GOOGL'],
    'AdjClose': [41.12, 41.31, 41.79, 41.63, 41.63, 2265.89, 2465.89, 2625.89, 2565.89, 2165.89],
    'LogReturn': [-0.000175, 0.004634, 0.011321, -0.003721, -0.000115, -0.003452, 0.004111, 0.032111, 0.003721, -0.000115],
})
stock = 'AAPL'  # the value printed below is for AAPL

# what compute_sr does, before the np.sqrt(62) scaling
mean_return = df[df['Symbol'] == stock]['LogReturn'].mean()
std = df[df['Symbol'] == stock]['LogReturn'].std()
sharpe_ratio = (mean_return - 0) / std
print(sharpe_ratio)

0.4111951194083935

So that messy line is effectively:

df1 = df[df['Symbol'] == stock]['Log Return']
# compute_sr will resolve to `sharpe_ratio`
sharpe_ratio = 0.4111951194083935
df1[sharpe_ratio]

This will raise an error (typically a KeyError), because 0.41119... is not a label in the index.
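As for the syntax error itself: as far as I can tell, Python rejects a bare `... for ... in ...` inside subscript brackets before pandas ever runs, because a comprehension needs its own pair of brackets. The sketch below checks this with `ast.parse` on two toy expressions (`s`, `f` and `xs` are placeholder names, and `parses` is a helper written for this example), and then reproduces the lookup error described above on a made-up Series.

```python
import ast
import pandas as pd

def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# A bare comprehension inside a subscript is a syntax error...
print(parses("s[f(x) for x in xs]"))    # False
# ...while a list comprehension with its own brackets parses fine.
print(parses("s[[f(x) for x in xs]]"))  # True

# Even with valid syntax, a float that is not an index label cannot be looked up.
s = pd.Series([0.01, -0.02, 0.03], index=pd.date_range('2018-01-03', periods=3))
try:
    s[0.4111951194083935]
    lookup_failed = False
except (KeyError, TypeError, IndexError):
    lookup_failed = True
print(lookup_failed)
```

So adding the missing brackets would only move you from a syntax error to a lookup error; the indexing approach itself needs to change.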

You can do something like the following:

import numpy as np
import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])
# Get the year and quarter for each value of 'Date'
df['Quarter'] = df['Date'].dt.quarter
df['Year'] = df['Date'].dt.year

# Filter the dataframe to keep only data from 2018 through 2021
df = df[(df['Year'] > 2017) & (df['Year'] < 2022)]
quarterly = df.groupby(['Symbol', 'Year', 'Quarter'])

# Sharpe ratio for one group of returns
def compute_sr(x, risk_free_rate=0):
    mean_return = x.mean()
    std = x.std()
    sharpe_ratio = (mean_return - risk_free_rate) / std

    # quarterly annualized Sharpe Ratio
    return sharpe_ratio * np.sqrt(62)

sr = quarterly['LogReturn'].apply(compute_sr)
print(sr)

Symbol  Year  Quarter
AAPL    2018  1         3.237754
GOOGL   2018  1         4.027694
Name: LogReturn, dtype: float64
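If you want the result as a new column on the original rows, as in the question, one possible variation is to group by a quarterly period and use `transform` instead of `apply`. This is only a sketch: the tickers `AAA` and `BBB`, the dates and the returns are made up, the `Period` column is a name chosen for this example, and the `np.sqrt(62)` scaling is simply carried over from the question.

```python
import numpy as np
import pandas as pd

def compute_sr(x, risk_free_rate=0):
    # Same formula as above; the sqrt(62) factor comes from the question's code.
    return (x.mean() - risk_free_rate) / x.std() * np.sqrt(62)

# Made-up returns for two hypothetical tickers, spanning two quarters.
df = pd.DataFrame({
    'Date': pd.to_datetime(['2018-03-28', '2018-03-29', '2018-04-02', '2018-04-03'] * 2),
    'Symbol': ['AAA'] * 4 + ['BBB'] * 4,
    'LogReturn': [0.010, 0.020, -0.010, 0.030, 0.000, 0.010, 0.020, 0.040],
})

# One quarterly period label per row instead of separate Year/Quarter columns.
df['Period'] = df['Date'].dt.to_period('Q')

# transform broadcasts each group's scalar result back onto that group's rows.
df['Sharpe Ratio'] = (
    df.groupby(['Symbol', 'Period'])['LogReturn'].transform(compute_sr)
)
print(df)
```

Every row then carries the Sharpe ratio of its own stock and quarter, which avoids the explicit loop over `list_of_stocks`.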