spyder 中的 RSI 使用 excel 中的数据
RSI in spyder using data in excel
所以我有一个 excel 文件,其中包含特定股票的数据。
我的 excel 文件包含大约 2 个月的数据,它以 5 分钟为间隔监控开盘价、收盘价、最高价、最低价和交易量,所以我的文件中大约有 3000 行文件。
我想每天计算一只股票的 RSI(或 EMA,如果更容易的话),我正在做一个汇总 table 来收集每日数据,以便它转换我的 table 3000 + 行到只有大约 60 行的 table(每行代表一天)。
本质上,我想要某种代码按日期对 excel 数据进行排序,然后将 RSI 计算为当天的单个值。 RSI 由以下公式给出:100-(100/(1+RS)) 其中 RS = 上涨的平均收益 periods/average 下跌期的损失。
注意:我的 excel 使用 'Datetime' 所以每一行的 'Datetime' 看起来像 '2022-03-03 9:30-5:00' 和下一行将是“2022-03-03 9:35-5:00”等。因此代码只需要查看日期而忽略我猜的时间。
一些代码可能有助于理解我在寻找什么:
所以我在这里调用我的 excel 文件,我希望代码获取调用的 excel 文件,按日期对数据进行分组,然后使用公式 I 计算每一天的 RSI上面写了
dat = pd.read_csv('AMD_5m.csv',index_col='Datetime',parse_dates=['Datetime'],
date_parser=lambda x: pd.to_datetime(x, utc=True))
dates = backtest.get_dates(dat.index)
#create a summary table
cols = ['Num. Obs.', 'Num. Trade', 'PnL', 'Win. Ratio','RSI'] #add addtional fields if necessary
summary_table = pd.DataFrame(index = dates, columns=cols)
# loop backtest by dates
这是我用来填写摘要中其他栏的代码 table,我将把我的 SMA(简单移动平均线)函数放在下面。
for d in dates:
this_dat = dat.loc[dat.index.date==d]
#find the number of observations in date d
summary_table.loc[d]['Num. Obs.'] = this_dat.shape[0]
#get trading (i.e. position holding) signals
signals = backtest.SMA(this_dat['Close'].values, window=10)
#find the number of trades in date d
summary_table.loc[d]['Num. Trade'] = np.sum(np.diff(signals)==1)
#find PnLs for 100 shares
shares = 100
PnL = -shares*np.sum(this_dat['Close'].values[1:]*np.diff(signals))
if np.sum(np.diff(signals))>0:
#close position at market close
PnL += shares*this_dat['Close'].values[-1]
summary_table.loc[d]['PnL'] = PnL
#find the win ratio
ind_in = np.where(np.diff(signals)==1)[0]+1
ind_out = np.where(np.diff(signals)==-1)[0]+1
num_win = np.sum((this_dat['Close'].values[ind_out]-this_dat['Close'].values[ind_in])>0)
if summary_table.loc[d]['Num. Trade']!=0:
summary_table.loc[d]['Win. Ratio'] = 1. *num_win/summary_table.loc[d]['Num. Trade']
这是我计算简单移动平均线的函数。有人告诉我尝试将其调整为 RSI 或 EMA(指数移动平均线)。显然,为 EMA 调整它并不太麻烦,但我无法弄清楚。
def SMA(p,window=10,signal_type='buy only'):
#input price "p", look-back window "window",
#signal type = buy only (default) --gives long signals, sell only --gives sell signals, both --gives both long and short signals
#return a list of signals = 1 for long position and -1 for short position
signals = np.zeros(len(p))
if len(p)<window:
#no signal if no sufficient data
return signals
sma = list(np.zeros(window)+np.nan) #the first few prices does not give technical indicator values
sma += [np.average(p[k:k+window]) for k in np.arange(len(p)-window)]
for i in np.arange(len(p)-1):
if np.isnan(sma[i]):
continue #skip the open market time window
if sma[i]<p[i] and (signal_type=='buy only' or signal_type=='both'):
signals[i] = 1
elif sma[i]>p[i] and (signal_type=='sell only' or signal_type=='both'):
signals[i] = -1
return signals
对此我有两个解决方案。一种是遍历每组,然后将相关数据添加到summary_table
,另一种是计算整个系列并将RSI
列设置为
我首先重新创建了数据:
import yfinance
import pandas as pd
# initially created similar data through yfinance,
# then copied this to Excel and changed the Datetime column to match yours.
df = yfinance.download("AAPL", period="60d", interval="5m")
# copied it and read it as a dataframe
df = pd.read_clipboard(sep=r'\s{2,}', engine="python")
df.head()
# Datetime Open High Low Close Adj Close Volume
#0 2022-03-03 09:30-05:00 168.470001 168.910004 167.970001 168.199905 168.199905 5374241
#1 2022-03-03 09:35-05:00 168.199997 168.289993 167.550003 168.129898 168.129898 1936734
#2 2022-03-03 09:40-05:00 168.119995 168.250000 167.740005 167.770004 167.770004 1198687
#3 2022-03-03 09:45-05:00 167.770004 168.339996 167.589996 167.718094 167.718094 2128957
#4 2022-03-03 09:50-05:00 167.729996 167.970001 167.619995 167.710007 167.710007 968410
然后我格式化数据并创建 summary_table
:
df["date"] = pd.to_datetime(df["Datetime"].str[:16], format="%Y-%m-%d %H:%M").dt.date
# calculate percentage change from open and close of each row
df["gain"] = (df["Close"] / df["Open"]) - 1
# your summary table, slightly changing the index to use the dates above
cols = ['Num. Obs.', 'Num. Trade', 'PnL', 'Win. Ratio','RSI'] #add addtional fields if necessary
summary_table = pd.DataFrame(index=df["date"].unique(), columns=cols)
选项 1:
# loop through each group, calculate the average gain and loss, then RSI
for grp, data in df.groupby("date"):
# average gain for gain greater than 0
average_gain = data[data["gain"] > 0]["gain"].mean()
# average loss for gain less than 0
average_loss = data[data["gain"] < 0]["gain"].mean()
# add to relevant cell of summary_table
summary_table["RSI"].loc[grp] = 100 - (100 / (1 + (average_gain / average_loss)))
选项 2:
# define a function to apply in the groupby
def rsi_calc(series):
avg_gain = series[series > 0].mean()
avg_loss = series[series < 0].mean()
return 100 - (100 / (1 + (avg_gain / avg_loss)))
summary_table["RSI"] = df.groupby("date")["gain"].apply(lambda x: rsi_calc(x))
输出(每个相同):
summary_table.head()
# Num. Obs. Num. Trade PnL Win. Ratio RSI
#2022-03-03 NaN NaN NaN NaN -981.214015
#2022-03-04 NaN NaN NaN NaN 501.950956
#2022-03-07 NaN NaN NaN NaN -228.379066
#2022-03-08 NaN NaN NaN NaN -2304.451654
#2022-03-09 NaN NaN NaN NaN -689.824739
所以我有一个 excel 文件,其中包含特定股票的数据。
我的 excel 文件包含大约 2 个月的数据,它以 5 分钟为间隔监控开盘价、收盘价、最高价、最低价和交易量,所以我的文件中大约有 3000 行文件。
我想每天计算一只股票的 RSI(或 EMA,如果更容易的话),我正在做一个汇总 table 来收集每日数据,以便它转换我的 table 3000 + 行到只有大约 60 行的 table(每行代表一天)。
本质上,我想要某种代码按日期对 excel 数据进行排序,然后将 RSI 计算为当天的单个值。 RSI 由以下公式给出:100-(100/(1+RS)) 其中 RS = 上涨的平均收益 periods/average 下跌期的损失。
注意:我的 excel 使用 'Datetime' 所以每一行的 'Datetime' 看起来像 '2022-03-03 9:30-5:00' 和下一行将是“2022-03-03 9:35-5:00”等。因此代码只需要查看日期而忽略我猜的时间。
一些代码可能有助于理解我在寻找什么:
所以我在这里调用我的 excel 文件,我希望代码获取调用的 excel 文件,按日期对数据进行分组,然后使用公式 I 计算每一天的 RSI上面写了
dat = pd.read_csv('AMD_5m.csv',index_col='Datetime',parse_dates=['Datetime'],
date_parser=lambda x: pd.to_datetime(x, utc=True))
dates = backtest.get_dates(dat.index)
#create a summary table
cols = ['Num. Obs.', 'Num. Trade', 'PnL', 'Win. Ratio','RSI'] #add addtional fields if necessary
summary_table = pd.DataFrame(index = dates, columns=cols)
# loop backtest by dates
这是我用来填写摘要中其他栏的代码 table,我将把我的 SMA(简单移动平均线)函数放在下面。
for d in dates:
this_dat = dat.loc[dat.index.date==d]
#find the number of observations in date d
summary_table.loc[d]['Num. Obs.'] = this_dat.shape[0]
#get trading (i.e. position holding) signals
signals = backtest.SMA(this_dat['Close'].values, window=10)
#find the number of trades in date d
summary_table.loc[d]['Num. Trade'] = np.sum(np.diff(signals)==1)
#find PnLs for 100 shares
shares = 100
PnL = -shares*np.sum(this_dat['Close'].values[1:]*np.diff(signals))
if np.sum(np.diff(signals))>0:
#close position at market close
PnL += shares*this_dat['Close'].values[-1]
summary_table.loc[d]['PnL'] = PnL
#find the win ratio
ind_in = np.where(np.diff(signals)==1)[0]+1
ind_out = np.where(np.diff(signals)==-1)[0]+1
num_win = np.sum((this_dat['Close'].values[ind_out]-this_dat['Close'].values[ind_in])>0)
if summary_table.loc[d]['Num. Trade']!=0:
summary_table.loc[d]['Win. Ratio'] = 1. *num_win/summary_table.loc[d]['Num. Trade']
这是我计算简单移动平均线的函数。有人告诉我尝试将其调整为 RSI 或 EMA(指数移动平均线)。显然,为 EMA 调整它并不太麻烦,但我无法弄清楚。
def SMA(p,window=10,signal_type='buy only'):
#input price "p", look-back window "window",
#signal type = buy only (default) --gives long signals, sell only --gives sell signals, both --gives both long and short signals
#return a list of signals = 1 for long position and -1 for short position
signals = np.zeros(len(p))
if len(p)<window:
#no signal if no sufficient data
return signals
sma = list(np.zeros(window)+np.nan) #the first few prices does not give technical indicator values
sma += [np.average(p[k:k+window]) for k in np.arange(len(p)-window)]
for i in np.arange(len(p)-1):
if np.isnan(sma[i]):
continue #skip the open market time window
if sma[i]<p[i] and (signal_type=='buy only' or signal_type=='both'):
signals[i] = 1
elif sma[i]>p[i] and (signal_type=='sell only' or signal_type=='both'):
signals[i] = -1
return signals
对此我有两个解决方案。一种是遍历每组,然后将相关数据添加到summary_table
,另一种是计算整个系列并将RSI
列设置为
我首先重新创建了数据:
import yfinance
import pandas as pd
# initially created similar data through yfinance,
# then copied this to Excel and changed the Datetime column to match yours.
df = yfinance.download("AAPL", period="60d", interval="5m")
# copied it and read it as a dataframe
df = pd.read_clipboard(sep=r'\s{2,}', engine="python")
df.head()
# Datetime Open High Low Close Adj Close Volume
#0 2022-03-03 09:30-05:00 168.470001 168.910004 167.970001 168.199905 168.199905 5374241
#1 2022-03-03 09:35-05:00 168.199997 168.289993 167.550003 168.129898 168.129898 1936734
#2 2022-03-03 09:40-05:00 168.119995 168.250000 167.740005 167.770004 167.770004 1198687
#3 2022-03-03 09:45-05:00 167.770004 168.339996 167.589996 167.718094 167.718094 2128957
#4 2022-03-03 09:50-05:00 167.729996 167.970001 167.619995 167.710007 167.710007 968410
然后我格式化数据并创建 summary_table
:
df["date"] = pd.to_datetime(df["Datetime"].str[:16], format="%Y-%m-%d %H:%M").dt.date
# calculate percentage change from open and close of each row
df["gain"] = (df["Close"] / df["Open"]) - 1
# your summary table, slightly changing the index to use the dates above
cols = ['Num. Obs.', 'Num. Trade', 'PnL', 'Win. Ratio','RSI'] #add addtional fields if necessary
summary_table = pd.DataFrame(index=df["date"].unique(), columns=cols)
选项 1:
# loop through each group, calculate the average gain and loss, then RSI
for grp, data in df.groupby("date"):
# average gain for gain greater than 0
average_gain = data[data["gain"] > 0]["gain"].mean()
# average loss for gain less than 0
average_loss = data[data["gain"] < 0]["gain"].mean()
# add to relevant cell of summary_table
summary_table["RSI"].loc[grp] = 100 - (100 / (1 + (average_gain / average_loss)))
选项 2:
# define a function to apply in the groupby
def rsi_calc(series):
avg_gain = series[series > 0].mean()
avg_loss = series[series < 0].mean()
return 100 - (100 / (1 + (avg_gain / avg_loss)))
summary_table["RSI"] = df.groupby("date")["gain"].apply(lambda x: rsi_calc(x))
输出(每个相同):
summary_table.head()
# Num. Obs. Num. Trade PnL Win. Ratio RSI
#2022-03-03 NaN NaN NaN NaN -981.214015
#2022-03-04 NaN NaN NaN NaN 501.950956
#2022-03-07 NaN NaN NaN NaN -228.379066
#2022-03-08 NaN NaN NaN NaN -2304.451654
#2022-03-09 NaN NaN NaN NaN -689.824739