How do I create a Python function that creates values based on the previous row's values?
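I have a Pandas DataFrame with 3 columns: Date, Return, and Result. The Return column holds the stock's gains as a percentage. The dates are daily and the index has been reset.
The DataFrame has hundreds of rows. I am trying to populate each row's current value using the previous value in the Result column. I can do this manually like so: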
df["Result"].iloc[0] = 100 * (1 + df["Return"].iloc[0])
df["Result"].iloc[1] = df["Result"].iloc[0] * (1 + df["Return"].iloc[1])
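The problem comes when I try to turn this into a function. I have also tried a lambda function, without success. Other resources oversimplify their examples. Can someone help?
Here is one of several iterations of the function that failed.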
def result_calc(df, i):
    df["Result"].iloc[i] = df["Result"].iloc[i-1] * (1 + df["Return"].iloc[i])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-410-c6a07bcb23a7> in <module>
      1 df1 = buy_hold_df.copy()
----> 2 df1.iloc[1:]["Result"] = df1.iloc[1:]["Result"].apply(result_calc(df1, 1))

~\Anaconda3\envs\FFN\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
   4106             else:
   4107                 values = self.astype(object)._values
-> 4108                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   4109
   4110         if len(mapped) and isinstance(mapped[0], Series):

pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()

TypeError: 'NoneType' object is not callable
IIUC:
# pseudo code
result[0] = 100 * (1 + return[0])
result[1] = result[0] * (1 + return[1]) = 100 * (1 + return[0]) * (1 + return[1])
...
result[n] = result[n-1] * (1 + return[n])
          = 100 * (1 + return[0]) * (1 + return[1]) * ... * (1 + return[n])
          = 100 * cumprod(1 + return)[n]
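The identity in the pseudocode can be checked numerically. This is a minimal sketch in plain Python, assuming a starting value of 100 as in the question; the `returns` list is made-up sample data and the variable names are illustrative only.

```python
from itertools import accumulate
import math
import operator

# Made-up sample returns, for illustration only.
returns = [0.01, -0.02, 0.03]
start = 100.0

# Recurrence: result[n] = result[n-1] * (1 + return[n]), seeded with `start`.
by_recurrence = []
previous = start
for r in returns:
    previous = previous * (1 + r)
    by_recurrence.append(previous)

# Closed form: start times the cumulative product of (1 + return).
growth = [1 + r for r in returns]
by_cumprod = [start * g for g in accumulate(growth, operator.mul)]

# The two agree up to floating-point rounding.
assert all(math.isclose(a, b) for a, b in zip(by_recurrence, by_cumprod))
```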
So:
>>> import pandas as pd
>>>
>>> df = pd.DataFrame(dict(
... Date=["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07", "2019-01-08"],
... Return=[.035039, .001354, .025693, .018128, .012625]
... ))
>>> df["Result"] = 100 * (df.Return + 1).cumprod()
>>> df
         Date    Return      Result
0  2019-01-02  0.035039  103.503900
1  2019-01-03  0.001354  103.644044
2  2019-01-04  0.025693  106.306971
3  2019-01-07  0.018128  108.234103
4  2019-01-08  0.012625  109.600559
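If an explicit function is still wanted, the recurrence can also be written as a loop. The attempt in the question raised the TypeError because `result_calc(df1, 1)` is called immediately and returns `None`, so `None` is what gets passed to `apply`. The sketch below is one possible shape for such a function; `result_calc_loop` and its `start` parameter are hypothetical names, not part of the question.

```python
import pandas as pd


def result_calc_loop(df, start=100.0):
    """Return a Series where each value is the previous one times (1 + Return)."""
    results = []
    previous = start  # seed value used for the first row
    for r in df["Return"]:
        previous = previous * (1 + r)
        results.append(previous)
    return pd.Series(results, index=df.index, name="Result")


df = pd.DataFrame({"Return": [.035039, .001354, .025693, .018128, .012625]})
df["Result"] = result_calc_loop(df)

# Same values as the vectorized form, up to floating-point rounding.
expected = 100 * (df["Return"] + 1).cumprod()
assert ((df["Result"] - expected).abs() < 1e-9).all()
```

For hundreds of rows either form is fast enough, but the `cumprod` version is shorter and avoids the Python-level loop.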