How to calculate pivot value from OHLC data with multiple groupby columns
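I have a pandas dataset with open, high, low, close, key1 and key2 columns. I want to group the dataset by key1 and key2 and calculate the pivot for each group with the formula (high + low + close) / 3. I can do this much. The requirement, however, is to shift each calculated value to the next group, and that is the part I cannot code.
I am able to group the dataset by the key1 and key2 columns and calculate the pivot data with the code below, but I cannot shift the values to the next group.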
import pandas as pd
data = pd.DataFrame([[110, 115, 105, 111, 1, 2],[11, 16, 6, 12, 1, 2],[12, 17, 7, 13, 1, 3],[22, 25, 17, 20, 1, 3],[12, 16, 6, 11, 2, 4],[32, 36, 26, 28, 2, 4],[9, 13, 4, 13, 2, 5],[49, 53, 40, 45, 2, 5],[13, 18, 9, 12, 3, 6],[14, 16, 10, 13, 3, 6]], columns=["open","high","low","close","key1", "key2"])
s = (data.high.groupby([data.key1, data.key2]).max() + data.low.groupby([data.key1, data.key2]).min() + data.close.groupby([data.key1, data.key2]).last()) / 3
#data['pivot'] = data['key1', 'key2'].map(s.shift())
print(data)
When I use the code below,
import pandas as pd
data = pd.DataFrame([[110, 115, 105, 111, 1, 2],[11, 16, 6, 12, 1, 2],[12, 17, 7, 13, 1, 3],[22, 25, 17, 20, 1, 3],[12, 16, 6, 11, 2, 4],[32, 36, 26, 28, 2, 4],[9, 13, 4, 13, 2, 5],[49, 53, 40, 45, 2, 5],[13, 18, 9, 12, 3, 6],[14, 16, 10, 13, 3, 6]], columns=["open","high","low","close","key1", "key2"])
data['pivot'] = (data.high.groupby([data.key1, data.key2]).transform('max') + data.low.groupby([data.key1, data.key2]).transform('min') + data.close.groupby([data.key1, data.key2]).transform('last')) / 3
print(data)
I get the following output.
open high low close key1 key2 pivot
0 110 115 105 111 1 2 44.333333
1 11 16 6 12 1 2 44.333333
2 12 17 7 13 1 3 17.333333
3 22 25 17 20 1 3 17.333333
4 12 16 6 11 2 4 23.333333
5 32 36 26 28 2 4 23.333333
6 9 13 4 13 2 5 34.000000
7 49 53 40 45 2 5 34.000000
8 13 18 9 12 3 6 13.333333
9 14 16 10 13 3 6 13.333333
But the expected output is:
open high low close key1 key2 pivot
0 110 115 105 111 1 2 NaN
1 11 16 6 12 1 2 NaN
2 12 17 7 13 1 3 44.333333
3 22 25 17 20 1 3 44.333333
4 12 16 6 11 2 4 17.333333
5 32 36 26 28 2 4 17.333333
6 9 13 4 13 2 5 23.333333
7 49 53 40 45 2 5 23.333333
8 13 18 9 12 3 6 34.000000
9 14 16 10 13 3 6 34.000000
First aggregate with GroupBy.agg, passing a dictionary of aggregate functions, then shift the result and add it as a new column with DataFrame.join:
s = data.groupby(['key1','key2']).agg({'low':'min','high':'max','close':'last'}).sum(axis=1)/3
data = data.join(s.rename('pivot').shift(), on=['key1','key2'])
print(data)
open high low close key1 key2 pivot
0 110 115 105 111 1 2 NaN
1 11 16 6 12 1 2 NaN
2 12 17 7 13 1 3 44.333333
3 22 25 17 20 1 3 44.333333
4 12 16 6 11 2 4 17.333333
5 32 36 26 28 2 4 17.333333
6 9 13 4 13 2 5 23.333333
7 49 53 40 45 2 5 23.333333
8 13 18 9 12 3 6 34.000000
9 14 16 10 13 3 6 34.000000
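To make the intermediate steps easier to follow, here is a minimal sketch of the same approach split into named stages. The names per_group, pivot, prev_pivot and result are only illustrative, and sort=False is my own assumption: it keeps groups in order of first appearance, which matters only if the keys are not already sorted as they are in the sample data. Named aggregation is used in place of the dictionary, which should be equivalent here.

```python
import pandas as pd

data = pd.DataFrame(
    [[110, 115, 105, 111, 1, 2], [11, 16, 6, 12, 1, 2],
     [12, 17, 7, 13, 1, 3], [22, 25, 17, 20, 1, 3],
     [12, 16, 6, 11, 2, 4], [32, 36, 26, 28, 2, 4],
     [9, 13, 4, 13, 2, 5], [49, 53, 40, 45, 2, 5],
     [13, 18, 9, 12, 3, 6], [14, 16, 10, 13, 3, 6]],
    columns=["open", "high", "low", "close", "key1", "key2"],
)

# One row per (key1, key2) group.
# Assumption: sort=False keeps the groups in order of first appearance.
per_group = data.groupby(["key1", "key2"], sort=False).agg(
    high=("high", "max"),
    low=("low", "min"),
    close=("close", "last"),
)

# (high + low + close) / 3 for each group
pivot = per_group.sum(axis=1) / 3

# Move every value down one group, so each group holds the previous group's pivot;
# the first group has no predecessor and becomes NaN.
prev_pivot = pivot.shift().rename("pivot")

# Broadcast the per-group values back onto the original rows.
result = data.join(prev_pivot, on=["key1", "key2"])
print(result)
```

Because shift works on the order of the aggregated rows, "next group" here means the next (key1, key2) pair in that order, regardless of whether key1 changes between them.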