Convert a pandas time series of monthly returns into a df with a cumulative yearly return column

Question

My df contains monthly returns and looks like this:

df=pd.DataFrame((x*x).dropna(),columns=['mthly rtrn']) 

             mthly rtrn
2016-09-30    0.002488
2016-10-31   -0.004692
2016-11-30    0.003157
2016-12-30   -0.000503
2017-01-31    0.008019
2017-02-28    0.010055
2017-03-31    0.003435
2017-04-28    0.002577
2017-05-31    0.012107
2017-06-30    0.001089

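How can I convert this into a df with columns Jan through Dec plus a cumulative column for each year? The rows should be 2016, 2017, and so on. Ideally the numbers would be displayed as %.

Desired output: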

      Jan   Feb  Mar  Apr  May  Jun  Jul  Aug  Sep   Oct  Nov  Dec ANNUAL
2016 -5.0  -0.1  6.7  0.4  1.7  0.3  3.6  0.1  0.0  -1.7  3.7  2.0   12.0
2017  1.8   3.9  0.1  1.0  1.4  0.6  0.1   NA   NA    NA   NA   NA    9.3

where ANNUAL is the cumprod of the monthly returns.

What is the most pythonic way to achieve this?

Answer 1

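I would first add extra Year and Month columns to your df using this, then use a pivot table to create a new df with the years as the index, the months as the columns, and the monthly returns as the values.

Once you have the pivot table, you can use apply along axis=1 to get whatever aggregate you need for each year.

I can't really comment on the aggregation, because I am not sure whether by 'cumulative' you mean additive or multiplicative. You may want to consider cumsum or , or if you prefer not to get scipy this function would also work.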
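The steps in this answer could be sketched roughly as follows. This is only an illustration: the sample values are copied from the question, the MONTHS list and the variable names are made up for the example, and the aggregation assumes that 'cumulative' means compounding (multiplicative), as the question's "cumprod" suggests. A vectorised prod along axis=1 stands in for apply here.

```python
import pandas as pd

# Column labels for the pivoted table (hypothetical helper, not from the post).
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Sample data copied from the question.
dates = pd.to_datetime([
    "2016-09-30", "2016-10-31", "2016-11-30", "2016-12-30", "2017-01-31",
    "2017-02-28", "2017-03-31", "2017-04-28", "2017-05-31", "2017-06-30",
])
returns = [0.002488, -0.004692, 0.003157, -0.000503, 0.008019,
           0.010055, 0.003435, 0.002577, 0.012107, 0.001089]
df = pd.DataFrame({"mthly rtrn": returns}, index=dates)

# Add helper columns holding the year and month of each observation.
df["Year"] = df.index.year
df["Month"] = df.index.month

# Pivot: one row per year, one column per month.
table = df.pivot_table(index="Year", columns="Month", values="mthly rtrn")

# Make sure all 12 months exist (missing ones become NaN), then label them.
table = table.reindex(columns=range(1, 13))
table.columns = MONTHS

# Compound the monthly returns along each row; NaN months are skipped.
table["ANNUAL"] = (1 + table).prod(axis=1) - 1

# Express everything in percent, rounded to one decimal place.
result = (table * 100).round(1)
print(result)
```

If an additive total is wanted instead, replacing the compounding line with table.sum(axis=1) should give it.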

Answer 2

I would first resample the data by month using the .resample() method:

http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.core.groupby.DataFrameGroupBy.resample.html

Then use pivot to turn the rows into columns:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.pivot_table.html

Then I would create a new column for the annual total:

df['annual'] = df['jan']+df['feb']+...+df['dec']
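A possible end-to-end sketch of these three steps, under some assumptions: the sample values are copied from the question, the variable names and the MONTHS list are invented for the example, and grouping on to_period("M") is swapped in for .resample() to sidestep differences in month-end frequency aliases between pandas versions. Both an additive total (as in the line above) and a compounded one are computed, since the question asks for a cumprod.

```python
import pandas as pd

# Column labels for the pivoted table (hypothetical helper, not from the post).
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Sample data copied from the question.
dates = pd.to_datetime([
    "2016-09-30", "2016-10-31", "2016-11-30", "2016-12-30", "2017-01-31",
    "2017-02-28", "2017-03-31", "2017-04-28", "2017-05-31", "2017-06-30",
])
returns = [0.002488, -0.004692, 0.003157, -0.000503, 0.008019,
           0.010055, 0.003435, 0.002577, 0.012107, 0.001089]
s = pd.Series(returns, index=dates, name="mthly rtrn")

# Step 1: one compounded return per calendar month
# (a no-op for data that is already monthly).
monthly = (1 + s).groupby(s.index.to_period("M")).prod() - 1

# Step 2: pivot rows into columns, years down and months across.
frame = pd.DataFrame({
    "year": monthly.index.year,
    "month": monthly.index.month,
    "ret": monthly.to_numpy(),
})
table = frame.pivot(index="year", columns="month", values="ret")
table = table.reindex(columns=range(1, 13))
table.columns = MONTHS

# Step 3: annual columns, computed before either is added to the table.
annual_sum = table.sum(axis=1)                   # additive total
annual_compound = (1 + table).prod(axis=1) - 1   # compounded return
table["annual_sum"] = annual_sum
table["ANNUAL"] = annual_compound

print((table * 100).round(2))
```

The two annual figures differ slightly because summing ignores compounding; for small monthly returns the gap is small.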

Answer 3

I found a great tool that does what I need: https://github.com/ranaroussi/monthly-returns-heatmap

python

pandas

python-3.6