rolling.apply on custom function that requires multiple columns of dataframe to reduce single column

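I am trying to create an additional column df['newc'] by calling rolling.apply on df['cond'] with a custom function. The custom function needs two columns of df, and I'm not sure how to make that work.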

I tried:

df['newc'] = df['cond'].rolling(4).apply(T_correction, 
args = (df['temp'].rolling(4)))

This obviously doesn't work, and it raises the following error:

raise NotImplementedError('See issue #11704 {url}'.format(url=url))
NotImplementedError: See issue #11704 https://github.com/pandas-dev/pandas/issues/11704

Perhaps rolling.apply is not the right tool here. I'm looking for suggestions for alternative solutions.

>>> df.head()
                       temp   cond
ts
2018-06-01 00:00:00  51.908  27.83
2018-06-01 00:05:00  52.144  27.83
2018-06-01 00:10:00  51.880  27.83
2018-06-01 00:15:00  52.001  27.83
2018-06-01 00:20:00  51.835  27.83

import numpy as np
import pandas as pd
import statsmodels.api as sm

def T_correction(df, d):
    df = pd.DataFrame(data = df)
    df.columns = ['cond']
    df['temp'] = d
    X = df.drop(['cond'], axis = 1)    # X features: temp

    X = sm.add_constant(X)             # add intercept
    lmodel = sm.OLS(df.cond, X)        # fit cond = a + b*temp
    results = lmodel.fit()             # run the least-squares fit
    Op = results.predict(X)            # derive 'cond' as explained by temp
    Tc1 = df.cond - Op                 # remove the linear influence

#---conditional correction --------------------------------------
    Tc = np.where(df.temp > (np.mean(df.temp) + 0.5*np.std(df.temp)), df.cond, Tc1)
    return Tc[-1]     # returning the last value

Expected result:

>>> df.head()
                       temp   cond   newc
ts
2018-06-01 00:00:00  51.908  27.83   NaN
2018-06-01 00:05:00  52.144  27.83   NaN
2018-06-01 00:10:00  51.880  27.83   NaN
2018-06-01 00:15:00  52.001  27.83   26.00
2018-06-01 00:20:00  51.835  27.83   25.00

This functionality does not currently appear to be available. There is an open issue about it on the pandas GitHub. See: https://github.com/pandas-dev/pandas/issues/15095.
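
Until that is supported, one possible workaround is to skip rolling.apply and slice the DataFrame by position yourself, so each window carries both columns. The following is only a minimal sketch, not a tested drop-in: rolling_two_columns and t_correction_window are hypothetical names, and np.polyfit is swapped in for sm.OLS so the example needs only numpy and pandas. It assumes temp is not constant within a window.

```python
import numpy as np
import pandas as pd


def t_correction_window(window):
    """Hypothetical two-column variant of T_correction for one window."""
    temp = window["temp"].to_numpy()
    cond = window["cond"].to_numpy()
    # Fit cond = a + b*temp; np.polyfit stands in for sm.OLS here.
    slope, intercept = np.polyfit(temp, cond, 1)
    residual = cond - (intercept + slope * temp)
    # Keep the raw cond where temp is well above the window mean.
    threshold = temp.mean() + 0.5 * temp.std()
    corrected = np.where(temp > threshold, cond, residual)
    return float(corrected[-1])  # last value of the window


def rolling_two_columns(df, size, func):
    """Apply func to every full window of `size` rows, all columns included."""
    out = np.full(len(df), np.nan)  # incomplete windows stay NaN
    for end in range(size, len(df) + 1):
        out[end - 1] = func(df.iloc[end - size:end])
    return pd.Series(out, index=df.index)
```

With the frame from the question, this would be called as df['newc'] = rolling_two_columns(df, 4, t_correction_window). The explicit Python loop is slower than a built-in rolling aggregation, but each window is a regular DataFrame, so the function can use as many columns as it needs.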