Getting Rolling Sum per Group

I have a data frame like this:

Product_ID    Quantity    Year    Quarter   
  1             100       2021      1          
  1             100       2021      2         
  1              50       2021      3          
  1             100       2021      4          
  1             100       2022      1         
  2             100       2021      1          
  2             100       2021      2          
  3             100       2021      1          
  3             100       2021      2         

For each Product_ID, I want the sum of Quantity over the previous three quarters (excluding the current one).

So I tried this:

df['Qty_Sum_3qrts'] = (df.groupby('Product_ID')['Quantity'].shift(1,fill_value=0)
                         .rolling(3).sum().reset_index(0,drop=True)
                       )

# Shifting 1, because I want to exclude the current row. 
# Rolling 3, because I want to have the 3 'rows' before 
# Grouping by, because I want to have the calculation PER product 

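My code fails because it does not compute the sum per product only; it also pulls in numbers from other products (for example, for product 2 in quarter 1 it gives me the sum of 3 rows of product 1).

My expected result: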

Product_ID    Quantity    Year    Quarter   Qty_Sum_3qrts
  1             100       2021      1          0 # because we don't have historical data for this id
  1             100       2021      2          100 # sum of last month of this product 
  1              50       2021      3          200 # sum of last 2 months of this product
  1             100       2021      4          250 # sum of last 3 months of this product
  1             100       2022      1          250 # sum of last 3 months of this product
  2             100       2021      1          0  # because we don't have hist data for this id
  2             100       2021      2          100 # sum of last month of this product
  3             100       2021      1          0   # etc
  3             100       2021      2          100  # etc 

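In your attempt, groupby(...).shift() returns an ordinary Series, so the rolling(3) that follows runs over the whole column and crosses product boundaries. You need to apply the rolling sum within each group, and you can use apply for that. Passing min_periods=1 lets the window produce a sum even when fewer than three earlier rows exist: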

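# group_keys=False keeps the original row index, so the result aligns with df on assignment
df['Qty_Sum_3qrts'] = (df.groupby('Product_ID', group_keys=False)['Quantity']
                         .apply(lambda s: s.shift(1,fill_value=0)
                                           .rolling(3, min_periods=1).sum())
                       )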

Output:

   Product_ID  Quantity  Year  Quarter  Qty_Sum_3qrts
0           1       100  2021        1            0.0
1           1       100  2021        2          100.0
2           1        50  2021        3          200.0
3           1       100  2021        4          250.0
4           1       100  2022        1          250.0
5           2       100  2021        1            0.0
6           2       100  2021        2          100.0
7           3       100  2021        1            0.0
8           3       100  2021        2          100.0
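
As a possible alternative, the same shift-then-roll logic can be run through transform instead of apply. The sketch below is self-contained: it rebuilds the sample frame from the question and assumes the rows are already ordered by Year and Quarter within each product. Because transform returns a result aligned with the original index, no index handling is needed before assigning the column.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Product_ID": [1, 1, 1, 1, 1, 2, 2, 3, 3],
    "Quantity":   [100, 100, 50, 100, 100, 100, 100, 100, 100],
    "Year":       [2021, 2021, 2021, 2021, 2022, 2021, 2021, 2021, 2021],
    "Quarter":    [1, 2, 3, 4, 1, 1, 2, 1, 2],
})

def previous_three_sum(s: pd.Series) -> pd.Series:
    """Sum of up to three preceding rows within one group, excluding the current row."""
    # shift(1) drops the current row from the window; fill_value=0 covers the first row.
    # min_periods=1 allows a result when fewer than three earlier rows exist.
    return s.shift(1, fill_value=0).rolling(3, min_periods=1).sum()

# transform applies the function per product and keeps df's original index.
df["Qty_Sum_3qrts"] = df.groupby("Product_ID")["Quantity"].transform(previous_three_sum)

print(df)
```

If the rows are not guaranteed to be in chronological order, sorting first with df.sort_values(["Product_ID", "Year", "Quarter"]) would be needed, since the rolling window depends on row order.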