How do you index and use current and previous column values to calculate the next column value in a pandas.apply function?
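I'm trying to build a function that calculates a trailing stop value from the Close and ATR values in a pandas DataFrame.
For reference, the DataFrame looks like this: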
High Low Open Close ATR
Date
2020-06-01 5.88 5.67 5.73 5.87 0.210000
2020-06-02 6.00 5.83 5.96 5.90 0.207143
2020-06-03 6.27 5.92 5.99 6.19 0.218776
2020-06-04 6.58 6.12 6.20 6.57 0.236006
2020-06-05 7.50 7.02 7.24 7.34 0.285577
2020-06-08 7.74 7.37 7.53 7.53 0.293750
2020-06-09 7.44 7.05 7.22 7.24 0.307053
2020-06-10 7.34 6.77 7.33 6.81 0.325835
2020-06-11 6.46 6.04 6.07 6.13 0.357561
What I'd like it to look like:
High Low Open Close ATR ATR_TS
Date
2020-06-01 5.88 5.67 5.73 5.87 0.210000 5.135000
2020-06-02 6.00 5.83 5.96 5.90 0.207143 5.175000
2020-06-03 6.27 5.92 5.99 6.19 0.218776 5.424286
2020-06-04 6.58 6.12 6.20 6.57 0.236006 5.743980
2020-06-05 7.50 7.02 7.24 7.34 0.285577 6.340481
2020-06-08 7.74 7.37 7.53 7.53 0.293750 6.501876
2020-06-09 7.44 7.05 7.22 7.24 0.307053 6.501876
2020-06-10 7.34 6.77 7.33 6.81 0.325835 6.501876
2020-06-11 6.46 6.04 6.07 6.13 0.357561 7.381464
My pseudo-function/logic currently looks like this:
def atr_ts(close, atr):
    bigatr = atr*3.5
    buysell = 1
    stop[i-1] = 0
    if buysell > 0:
        stop = close - bigatr
        stop = max(stop, stop[i-1])
        if close < stop:
            stop = close + bigatr
            buysell = -1
    elif buysell < 0:
        stop = close + bigatr
        stop = min(stop, stop[i-1])
        if close > stop:
            stop = close - bigatr
            buysell = 1
    return stop

df['ATR_TS'] = df.apply(lambda col: atr_ts(col['Close'], col['ATR']), axis = 1)
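So my question is: how do I index the previous stop value (ATR_TS) calculated by this function in order to calculate the next stop value, with the first stop value being 0?
If anyone sees a better way to solve this without using pandas.apply, please share that too.
I'm new to programming in general, so I apologize if anything is unclear.
Thanks a lot.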
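This isn't a final solution, because I'm confused about why the last value of ATR_TS is 7.381464, although I can see how you calculated that value. I've created a number of columns to visualize the "pandonic" way of doing row-wise comparisons, using .shift() and .cumsum() for some of the calculations. Please take a look at the columns and my screenshot and explain how the last value is obtained. Not all of these columns would be needed in a final solution: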
df['bigatr'] = (df['ATR'] * 3.5)
df['Stop1a'] = df['Close'] - (df['ATR'] * 3.5)
df['Stop2a'] = df.shift()['Close'] - (df.shift()['ATR'] * 3.5)
df['Stop3a'] = df[['Stop1a','Stop2a']].max(axis=1)
df['Stop1b'] = df['Close'] + (df['ATR'] * 3.5)
df['Stop2b'] = df.shift()['Close'] + (df.shift()['ATR'] * 3.5)
df['Stop3b'] = df[['Stop1b','Stop2b']].min(axis=1)
df['cuma'] = (df['Stop1a'] > df.shift()['Stop1a']).cumsum()
df['cumb'] = (df['Stop1b'] < df.shift()['Stop1b']).cumsum()
df['ATR_TSa'] = df.groupby((df['Stop1a'] > df.shift()['Stop1a']).cumsum())['Stop1a'].transform('first')
df['ATR_TSb'] = df.groupby((df['Stop1b'] < df.shift()['Stop1b']).cumsum())['Stop1b'].transform('first')
df
As you can see, the final solution is the ['ATR_TSa'] values circled in red and the df['ATR_TSb'] value circled in blue in the last row.
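To make the .shift() / .cumsum() grouping used above easier to follow, here is a minimal sketch on a small made-up Series (the values and the names `s`, `group_id` and `held` are illustrative assumptions, not part of the original data). Each time the value rises above the previous row, a new group starts, and `transform('first')` then repeats the first value of each group across its rows:

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the grouping trick
s = pd.Series([5.0, 5.2, 5.1, 4.9, 5.5])

# True wherever the value is higher than on the previous row
# (the first row compares against NaN, which yields False)
rising = s > s.shift()

# A running count of the True values gives every row a group label
group_id = rising.cumsum()

# Repeat the first value of each group across the whole group
held = s.groupby(group_id).transform('first')

print(group_id.tolist())
print(held.tolist())
```

Note that this holds the value from the last rise, which is not necessarily the running maximum of the series.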
Edit #1 - Per the OP's comment, the final logic to solve the above is to add:
df['ATR_TS'] = np.where((df['Close'] < df['ATR_TSa']), df['ATR_TSb'], df['ATR_TSa'])
Now, below, I'll provide a more concise solution:
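import numpy as np

# Candidate stops 3.5 ATRs below and above the close
df['Stop1a'] = df['Close'] - (df['ATR'] * 3.5)
df['Stop1b'] = df['Close'] + (df['ATR'] * 3.5)
# A new group starts when Stop1a rises (or Stop1b does not rise) versus the previous row;
# each row then takes the first value of its group
a = df.groupby((df['Stop1a'] > df.shift()['Stop1a']).cumsum())['Stop1a'].transform('first')
b = df.groupby((df['Stop1b'] <= df.shift()['Stop1b']).cumsum())['Stop1b'].transform('first')
# Take the upper stop on rows where the close is below the lower stop
df['ATR_TS'] = np.where((df['Close'] < a), b, a)
df = df.drop(['Stop1a','Stop1b'], axis=1)
df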
Out[1]:
Date High Low Open Close ATR ATR_TS
0 2020-06-01 5.88 5.67 5.73 5.87 0.210000 5.135000
1 2020-06-02 6.00 5.83 5.96 5.90 0.207143 5.175000
2 2020-06-03 6.27 5.92 5.99 6.19 0.218776 5.424284
3 2020-06-04 6.58 6.12 6.20 6.57 0.236006 5.743979
4 2020-06-05 7.50 7.02 7.24 7.34 0.285577 6.340480
5 2020-06-08 7.74 7.37 7.53 7.53 0.293750 6.501875
6 2020-06-09 7.44 7.05 7.22 7.24 0.307053 6.501875
7 2020-06-10 7.34 6.77 7.33 6.81 0.325835 6.501875
8 2020-06-11 6.46 6.04 6.07 6.13 0.357561 7.381463
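For comparison, the pseudocode in the question carries two pieces of state from row to row (the previous stop and the buy/sell direction), and that can also be written as a plain Python loop instead of pandas.apply. The following is only a sketch under that reading of the pseudocode; the function name `atr_trailing_stop` and the `multiplier` parameter are hypothetical names introduced here, and the first previous stop is assumed to be 0 as stated in the question:

```python
import pandas as pd

def atr_trailing_stop(close, atr, multiplier=3.5):
    """Loop version of the question's pseudocode (hypothetical helper)."""
    stops = []
    prev_stop = 0.0   # assumption from the question: the first previous stop is 0
    direction = 1     # 1 = long (stop below the close), -1 = short (stop above)
    for c, a in zip(close, atr):
        band = a * multiplier
        if direction > 0:
            # While long, the stop may only move up
            stop = max(c - band, prev_stop)
            if c < stop:
                # Close fell through the stop: flip to short
                stop = c + band
                direction = -1
        else:
            # While short, the stop may only move down
            stop = min(c + band, prev_stop)
            if c > stop:
                # Close rose through the stop: flip to long
                stop = c - band
                direction = 1
        stops.append(stop)
        prev_stop = stop
    return pd.Series(stops, index=close.index)

# Close and ATR columns copied from the question's DataFrame
df = pd.DataFrame(
    {
        "Close": [5.87, 5.90, 6.19, 6.57, 7.34, 7.53, 7.24, 6.81, 6.13],
        "ATR": [0.210000, 0.207143, 0.218776, 0.236006, 0.285577,
                0.293750, 0.307053, 0.325835, 0.357561],
    },
    index=pd.to_datetime(
        ["2020-06-01", "2020-06-02", "2020-06-03", "2020-06-04", "2020-06-05",
         "2020-06-08", "2020-06-09", "2020-06-10", "2020-06-11"]
    ),
)
df["ATR_TS"] = atr_trailing_stop(df["Close"], df["ATR"])
print(df)
```

On the nine sample rows this gives the same ATR_TS values as the output above, up to the rounding of the printed ATR column. A loop is slower than vectorized code on large frames, but it keeps the direction flag explicit, which the groupby approach does not track.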