Perform calculations based on signals in an array
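I have two columns in an array: a 'close' column and a 'signals' column. I want to perform calculations on the data in 'close' based on the categorical data in 'signals'. If the same signal appears consecutively (ignoring NaN), nothing should happen. The calculation should run only when the 'signals' value at index n+t is the opposite of the 'signals' value at the earlier index n.
This is basic backtesting code meant to show that the algorithm I have in mind works logically. I know a for loop is probably needed, but I am not sure how to apply it correctly to specific index points in the data.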
Pseudocode:
for n in signals:
    if signals[n] == 1:
        while signals[n+t] == 1: keep 'close' at index n
        when signals[n+t] == 2:
            calculations[n+t] = close[n+t] - close[n]
This is the output I would like to get programmatically:
   close  signals  calculations
0    100      NaN           NaN
1    105        1           NaN
2    110      NaN           NaN
3    107        1           NaN
4    115      NaN           NaN
5    120        2            15
Thanks for your help, and let me know if anything needs clarifying!
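One way to do it:
- Use dropna to extract the rows where "signals" is not null
- Use shift to remove consecutive duplicates
- Set the output column: if the signal is 2, set the difference in close; otherwise set NaN. I used np.where()
- Use join to add this column back to the input dataframe
Here is the code: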
# Import modules
import pandas as pd
import numpy as np
# Build dataset (same values as the sample in the question)
# np.nan is used because the np.NaN alias was removed in NumPy 2.0
data = [[100, np.nan],
        [105, 1],
        [110, np.nan],
        [107, 1],
        [115, np.nan],
        [120, 2]]
df = pd.DataFrame(data, columns=["close", "signals"])
# Select rows where "signals" is not null
# .copy() avoids a SettingWithCopyWarning when the new column is added below
sub_df = df.dropna(subset=['signals']).copy()
# Remove consecutive duplicates (keep the first row of each run)
sub_df = sub_df.loc[sub_df.signals.shift() != sub_df.signals].copy()
# If signal == 2, set diff between close and previous close
# Else: set NaN
sub_df['output'] = np.where(sub_df.signals == 2, sub_df.close - sub_df.close.shift(), np.nan)
print(sub_df)
#    close  signals  output
# 1    105      1.0     NaN
# 5    120      2.0    15.0
# Update dataframe with the new column
print(df.join(sub_df['output']))
#    close  signals  output
# 0    100      NaN     NaN
# 1    105      1.0     NaN
# 2    110      NaN     NaN
# 3    107      1.0     NaN
# 4    115      NaN     NaN
# 5    120      2.0    15.0
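Since the question asked about a for loop, here is a rough loop-based sketch of the same idea, for comparison with the vectorized version. The function name signal_diffs and its entry/exit_ parameters are made up for illustration, and it assumes that 1 opens a position and 2 closes it, which is how I read the pseudocode.

```python
import numpy as np
import pandas as pd


def signal_diffs(df, entry=1, exit_=2):
    """Hypothetical helper: close at the exit signal minus close at the
    first unmatched entry signal. Assumes 1 = entry and 2 = exit."""
    out = pd.Series(np.nan, index=df.index, name="calculations")
    entry_close = None  # close held since the first unmatched entry signal
    for idx, close, signal in zip(df.index, df["close"], df["signals"]):
        if pd.isna(signal):
            continue  # rows without a signal are ignored
        if signal == entry and entry_close is None:
            entry_close = close  # keep the first entry, skip repeated ones
        elif signal == exit_ and entry_close is not None:
            out.loc[idx] = close - entry_close  # write the result at the exit row
            entry_close = None  # wait for the next entry signal
    return out


df = pd.DataFrame({
    "close": [100, 105, 110, 107, 115, 120],
    "signals": [np.nan, 1, np.nan, 1, np.nan, 2],
})
df["calculations"] = signal_diffs(df)
print(df)
```

For the sample data this puts 15 at index 5 and NaN everywhere else, the same as the dropna/shift approach. The loop is easier to extend (for example, to handle more signal types), but it will be slower on large frames.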