Backtesting a trading strategy with Python and pandas: recognizing only one open position at a time
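This is a long read, but I've looked through a lot of examples on Stack Overflow about creating functions to iterate over DataFrames and couldn't find anything that fits my needs. I've also only been using Python, and coding in general, for about 2 months, so I apologize if anything is unclear.
I have a DataFrame of daily price history, and I'm trying to build a backtest for a buy signal based on this strategy:
First we look for a day whose close is greater than the close of both the day before and the day after. We call this the "base day."
To trigger our buy signal, we wait for the day the close moves back above the "base day" close. We now have an open position.
We hold that position until we get a sell signal that is the mirror image of our buy signal (i.e. the close falls below the close of an earlier day whose close was lower than both the day before and the day after).
I only want one buy active at a time until we receive a sell signal, and then the process starts over.
Below is a sample DataFrame with a small slice of the data I'm looking at: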
import pandas as pd

# Dates are quoted so they stay strings (unquoted, 1/3/2000 is evaluated as a division)
data = {
    'date': ['1/3/2000','1/4/2000','1/5/2000','1/6/2000','1/7/2000','1/10/2000','1/11/2000','1/12/2000',
             '1/13/2000','1/14/2000','1/18/2000','1/19/2000','1/20/2000','1/21/2000','1/24/2000','1/25/2000',
             '1/26/2000','1/27/2000','1/28/2000','1/31/2000','2/1/2000','2/2/2000','2/3/2000','2/4/2000',
             '2/7/2000','2/8/2000','2/9/2000','2/10/2000','2/11/2000','2/14/2000','2/15/2000','2/16/2000',
             '2/17/2000','2/18/2000','2/22/2000','2/23/2000','2/24/2000','2/25/2000','2/28/2000','2/29/2000'],
    'close': [308.3,315.3,314.4,307.5,309.8,313.4,310.7,324.2,332.5,348.8,351.1,348.2,348.7,343.5,343,343.3,
              342.4,343,334.4,334.6,336,333.8,331.6,332.8,335.9,341.2,338.4,342.1,343.2,339.5,346.9,342,
              339.6,337.4,335,330.8,331.3,331.1,332.6,335.1]}
df = pd.DataFrame(data)

# Create columns to compare the close with the day before and the day after
df['prev_close'] = df['close'].shift(1)
df['next_close'] = df['close'].shift(-1)
# True if the close is lower than both the previous and the next day
df['high_high'] = (df['prev_close'] > df['close']) & (df['next_close'] > df['close'])
# True if the close is greater than both the previous and the next day
df['low_low'] = (df['prev_close'] < df['close']) & (df['next_close'] < df['close'])
# Close of the most recent day where low_low is True,
# forward-filled to keep the comparison price active
df['comp_price'] = df['close'].where(df['low_low']).ffill()
# Sell comparison price to reference when closing the position
df['sell_comp'] = df['close'].where(df['high_high']).ffill()
# Buy signal
df['buy_sig'] = df['close'] > df['comp_price']
# Designate the first instance of the buy signal as the day to open a position
df['open_pos'] = (df['buy_sig'] == 1) & (df['buy_sig'].shift(1) != 1)
df['take_signal'] = (df['buy_sig'] == 1) & (df['open_pos'] == True)
df['open_pos_price'] = df['close'].where(df['take_signal']).ffill()
# Sell signal
df['sell_sig'] = df['close'] < df['sell_comp']
# Designate the first instance of the sell signal as the day to close the position
df['close_pos'] = (df['sell_sig'] == True) & (df['sell_sig'].shift(1) == False)
# Date the position was opened, carried forward
df['open_pos_date'] = df['date'].where(df['open_pos'] & df['take_signal']).ffill()
# Date and price of closing the position
df['close_pos_price'] = df['close'].where(df['close_pos'])
df['close_pos_date'] = df['date'].where(df['close_pos'])
# Gain for the trade
df['gain'] = (df['close_pos_price'] - df['open_pos_price']).where((df['close_pos_price'] > 0) & (df['open_pos_price'] > 0))
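I then created another DataFrame that shows the results at each sell signal, so that I can later convert the results to tuples and iterate over them to add transaction costs and so on for charting purposes.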
strat_df = df.loc[(df['close_pos'] == True)&(df['sell_sig'] == True), ['open_pos_date','open_pos_price', 'close_pos_date','close_pos_price','gain']]
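I'm seeing multiple instances of the same open_pos_date with different close_pos_date values. Somewhere along the way I'm allowing multiple open positions.
I want to take my first buy signal as my only position and ignore all other buy signals until I receive a sell signal. At that point I want to look for a new buy signal and hold that position until I get a new sell signal.
I've probably created more columns than necessary, but I'm having a hard time finding a way to get a unique open-position signal and then compare its price with the price at the time I receive a sell signal. If anyone can recommend a cleaner way to do this, I'm happy to scrap my first attempt and give it a try.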
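While you generally want to avoid looping over the rows of a DataFrame because it is slow and inefficient, I find it is often the best approach for backtesting. Because your position and portfolio value at time T depend on the values at T-1, they usually have to be computed row by row, and it is much simpler that way.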
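As a rough illustration of that T-1 dependence, here is a stripped-down sketch. It assumes the raw entry and exit signals already exist as boolean lists; the names `latch_positions`, `entry` and `exit_` are made up for this example and are not from the question. The point is only that a repeated entry signal is ignored while a position is already held, which is something a purely column-wise comparison cannot know.

```python
def latch_positions(entry, exit_):
    """Turn raw entry/exit signals into a 0/1 position, one position at a time."""
    pos = []
    held = 0  # position carried over from the previous row (the T-1 value)
    for e, x in zip(entry, exit_):
        if held == 0 and e:
            held = 1   # open only when flat
        elif held == 1 and x:
            held = 0   # close only when a position is open
        pos.append(held)
    return pos

# Hypothetical signals: the second entry (index 2) arrives while already long
entry = [False, True, True, False, False, True, False]
exit_ = [False, False, False, True, True, False, False]
print(latch_positions(entry, exit_))  # [0, 1, 1, 0, 0, 1, 1]
```

The loop for the sample data follows the same pattern, with the extra step of tracking the base prices as it goes.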
import pandas as pd

data = {'date': ['1/3/2000','1/4/2000','1/5/2000','1/6/2000','1/7/2000','1/10/2000',
                 '1/11/2000','1/12/2000','1/13/2000','1/14/2000','1/18/2000','1/19/2000','1/20/2000','1/21/2000',
                 '1/24/2000','1/25/2000','1/26/2000','1/27/2000','1/28/2000','1/31/2000','2/1/2000','2/2/2000',
                 '2/3/2000','2/4/2000','2/7/2000','2/8/2000','2/9/2000','2/10/2000','2/11/2000','2/14/2000',
                 '2/15/2000','2/16/2000','2/17/2000','2/18/2000','2/22/2000','2/23/2000','2/24/2000','2/25/2000',
                 '2/28/2000','2/29/2000'],
        'close': [308.3,315.3,314.4,307.5,309.8,313.4,310.7,324.2,332.5,348.8,351.1,348.2,348.7,343.5,343,343.3,342.4,343,
                  334.4,334.6,336,333.8,331.6,332.8,335.9,341.2,338.4,342.1,343.2,339.5,346.9,342,339.6,337.4,335,330.8,331.3,
                  331.1,332.6,335.1]}
df = pd.DataFrame(data)
df = df.set_index(['date'])
df['pos'] = 0.0  # float column, so writing 1.0 / 0.0 below does not change its dtype
pos_col = df.columns.get_loc('pos')

base_buy = 999999.0  # no base day yet: no close can exceed this
base_sell = 0.0      # no sell base yet: no close can fall below this

# The loop stops before the last row because that row has no next-day close
for i in range(2, df.shape[0] - 1):
    px_m1 = df['close'].iloc[i - 1]
    px = df['close'].iloc[i]
    px_p1 = df['close'].iloc[i + 1]
    pos = df['pos'].iloc[i - 1]  # position carried over from the previous day

    # base_buy: while flat, a close above both neighbours becomes the base day
    if px > px_m1 and px > px_p1 and pos == 0:
        base_buy = px
    # entry signal: while flat, a close above the base day opens the position
    if px > base_buy and pos == 0:
        pos = 1.0
        base_sell = 0.0
    # base_sell: while long, a close below both neighbours becomes the sell base
    if px < px_m1 and px < px_p1 and pos == 1:
        base_sell = px
    # exit signal: while long, a close below the sell base closes the position
    if px < base_sell and pos == 1.0:
        pos = 0.0
        base_buy = 999999.0

    df.iloc[i, pos_col] = pos

print(df)
Output (first rows):
           close  pos
date
1/3/2000   308.3  0.0
1/4/2000   315.3  0.0
1/5/2000   314.4  0.0
1/6/2000   307.5  0.0
1/7/2000   309.8  0.0
1/10/2000  313.4  0.0
1/11/2000  310.7  0.0
1/12/2000  324.2  1.0
1/13/2000  332.5  1.0
1/14/2000  348.8  1.0
1/18/2000  351.1  1.0
1/19/2000  348.2  1.0
1/20/2000  348.7  1.0
1/21/2000  343.5  0.0
1/24/2000  343.0  0.0
1/25/2000  343.3  0.0
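To get from the `pos` column to the per-trade table the question asks for (open date and price, close date and price, gain), one option is to look at where `pos` changes. This is only a sketch. `extract_trades` is a hypothetical helper name, it assumes the index is unique and `pos` only takes the values 0 and 1, and it assumes a trade is opened and closed at the close of the day on which `pos` flips. The sample frame simply retypes the rows printed above.

```python
import pandas as pd

def extract_trades(df):
    """Build one row per trade from a frame with 'close' and 'pos' columns."""
    change = df['pos'].diff()
    # diff() leaves the first row as NaN; treat a position held on row 0 as an entry
    change.iloc[0] = df['pos'].iloc[0]
    entries = change[change > 0].index  # days where pos went 0 -> 1
    exits = change[change < 0].index    # days where pos went 1 -> 0

    rows = []
    for n, entry in enumerate(entries):
        # A trade still open at the end of the data has no exit yet
        exit_ = exits[n] if n < len(exits) else None
        entry_px = df.at[entry, 'close']
        exit_px = df.at[exit_, 'close'] if exit_ is not None else float('nan')
        rows.append({
            'open_pos_date': entry,
            'open_pos_price': entry_px,
            'close_pos_date': exit_,
            'close_pos_price': exit_px,
            'gain': round(exit_px - entry_px, 2),
        })
    return pd.DataFrame(rows, columns=['open_pos_date', 'open_pos_price',
                                       'close_pos_date', 'close_pos_price', 'gain'])

# The 16 rows shown in the output above
sample = pd.DataFrame({
    'date': ['1/3/2000', '1/4/2000', '1/5/2000', '1/6/2000', '1/7/2000', '1/10/2000',
             '1/11/2000', '1/12/2000', '1/13/2000', '1/14/2000', '1/18/2000', '1/19/2000',
             '1/20/2000', '1/21/2000', '1/24/2000', '1/25/2000'],
    'close': [308.3, 315.3, 314.4, 307.5, 309.8, 313.4, 310.7, 324.2,
              332.5, 348.8, 351.1, 348.2, 348.7, 343.5, 343.0, 343.3],
    'pos': [0.0] * 7 + [1.0] * 6 + [0.0] * 3,
}).set_index('date')

trades = extract_trades(sample)
print(trades)  # one trade: opened 1/12/2000 at 324.2, closed 1/21/2000 at 343.5
```

Because each row is one complete trade, something like a per-trade transaction cost could then be subtracted from `gain` directly.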