Variable Moving Average

I have a DataFrame that looks like this:

    a    b             
1   0.9  0.796522123    
2   0.8  0.701075019    
3   0.6  0.777130253    
4   0.5  0.209912906    
5   0.75 0.920537662    
6   1    0.955212665    
7   3.5  0.227221963    
8   2    0.336632891    
9   1.25 0.563511758    
10  1    0.832624112    
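
For anyone who wants to reproduce this, the table can be rebuilt roughly as follows. This is only a sketch: the values are copied from the table above, and the 1-based integer index is an assumption taken from the row labels.

```python
import pandas as pd

# Values copied from the table above; the index runs from 1 to 10.
df = pd.DataFrame(
    {
        "a": [0.9, 0.8, 0.6, 0.5, 0.75, 1, 3.5, 2, 1.25, 1],
        "b": [0.796522123, 0.701075019, 0.777130253, 0.209912906, 0.920537662,
              0.955212665, 0.227221963, 0.336632891, 0.563511758, 0.832624112],
    },
    index=range(1, 11),
)
print(df)
```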

I want to create a moving average with a maximum period of 3, where each observation is df['a']*df['b'].

If df['a'].rolling(window=3).sum() <= 3, then the moving average would be:

df['MA'] = (df['a']*df['b']).rolling(window=3).mean()

However, if df['a'].rolling(window=3).sum() > 3, as is the case for df[8:10], then I would like the moving average to be:

((1*0.832624112)+(1.25*0.563511758)+(0.75*0.336632891))/3.
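
The weights for that window can be traced step by step. The sketch below assumes the rule is: walk from the newest row backwards and cap each value of a so that the weights never add up to more than the period. The names `remaining`, `weights` and `total` are only illustrative.

```python
# Rows 10, 9 and 8 of the table, newest first.
a = [1.0, 1.25, 2.0]
b = [0.832624112, 0.563511758, 0.336632891]
period = 3

remaining = period  # weight budget that is still available
total = 0.0
weights = []
for a_i, b_i in zip(a, b):
    w = min(a_i, remaining)  # cap the weight so the sum never exceeds period
    weights.append(w)
    total += w * b_i
    remaining -= w

ma = total / period
print(weights)       # [1.0, 1.25, 0.75]
print(round(ma, 6))  # 0.596496
```

Row 8 has a = 2, but only 0.75 of the budget is left by the time it is reached, so it contributes with weight 0.75.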

I have been trying to create a function and then apply it, something like:

def MA(a, b, period):
    total = 0
    sum_a = 0
    for i in (b):
        if sum_a < period:
            sum_a += a
            total += (a*b)
        else:
            sum_a = sum_a - a
            diff = period - sum_a
            total = total + (diff*b)
    return total

df['MA'] = df.apply(lambda x: MA(a, b, period), axis=1)

I can't get this to work, and I imagine there is a simpler way to do it using pandas or numpy.

Thank you very much.

import numpy as np

def rolling_mean(x):
    # The first two rows do not have a full window of 3 rows.
    # Note: this relies on the consecutive integer index (1, 2, 3, ...) of df.
    if x.name < 3:
        return np.nan
    # Cap the current row's 'a' value at 3.
    a1 = x.a if x.a <= 3 else 3
    # Look at the previous 2 rows and reduce the 'a' value if the sum of the 3 rows is greater than 3.
    a2 = df.loc[x.name-1].a if (a1 + df.loc[x.name-1].a) <= 3 else 3 - a1
    a3 = df.loc[x.name-2].a if (a1 + a2 + df.loc[x.name-2].a) <= 3 else 3 - a1 - a2
    # Calculate the mean of adjusted a * b for the current row and the 2 rows above.
    return (a1*x.b + a2*df.loc[x.name-1].b + a3*df.loc[x.name-2].b) / 3.0

# Apply rolling_mean to each row of the DataFrame to create a new column 'MA'.
df['MA'] = df.apply(func=rolling_mean, axis=1)
df
Out[500]: 
       a         b        MA
1   0.90  0.796522       NaN
2   0.80  0.701075       NaN
3   0.60  0.777130  0.581336
4   0.50  0.209913  0.377365
5   0.75  0.920538  0.420546
6   1.00  0.955213  0.583524
7   3.50  0.227222  0.227222
8   2.00  0.336633  0.300163
9   1.25  0.563512  0.431166
10  1.00  0.832624  0.596496
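
If depending on the global df and on its integer index is a concern, the same capping rule can be written against plain arrays. This is a minimal sketch, not a tuned implementation: `capped_weighted_ma` is a hypothetical helper name, `period` is assumed to be both the window length and the cap on the summed weights (as in the question), and the values of a are assumed to be non-negative.

```python
import numpy as np


def capped_weighted_ma(a, b, period=3):
    """Moving average of a*b over `period` rows, with the weights taken from
    the newest row backwards and capped so they sum to at most `period`."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.full(len(a), np.nan)  # rows without a full window stay NaN
    for i in range(period - 1, len(a)):
        remaining = float(period)
        total = 0.0
        for j in range(i, i - period, -1):  # newest row first
            w = min(a[j], remaining)        # capped weight for row j
            total += w * b[j]
            remaining -= w
        out[i] = total / period
    return out


# Values copied from the question's table.
a = [0.9, 0.8, 0.6, 0.5, 0.75, 1, 3.5, 2, 1.25, 1]
b = [0.796522123, 0.701075019, 0.777130253, 0.209912906, 0.920537662,
     0.955212665, 0.227221963, 0.336632891, 0.563511758, 0.832624112]
ma = capped_weighted_ma(a, b)
print(np.round(ma, 6))
```

With a DataFrame this would be used as df['MA'] = capped_weighted_ma(df['a'], df['b']). The explicit loop is still O(n * period), but it works on positions, so it does not need a consecutive integer index.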