Add a column with the difference between dates in a pandas DataFrame

I have a DataFrame like this:

season       Date          Holiday_Name  
12-13        11/1/12          NaN        
12-13        11/2/12          Nan        
12-13        3/31/13         Easter        
12-13         4/5/13           NaN           

13-14        11/1/13          NaN.  
13-14        4/18/14          Nan.   
13-14        4/20/14         Easter.  
13-14        4/22/14          Nan.   

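and so on...

What I need is a new column holding, for each season, the difference in days between each date and that season's Easter.

I have already tried groupby, for loops (even though I know that is the wrong approach), and the where method, and nothing seems to work.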

dataset["difference"] = dataset["Date"] -dataset["Date"].where(dataset["holiday_name"]=="Easter").days

But it gives me this error:

'Series' object has no attribute 'days'

dataset['differenza_pasqua'] = pd.Index(dataset["Data"] -dataset["Data"].where(dataset["holiday_name"]=="Pasqua di Resurrezione").dropna()).days

With this, the Easter row is set to 0, but all the other rows come out as NaN.

What I expect is something like this:

season       Date          Holiday_Name      difference
12-13        11/1/12          NaN               150
12-13        11/2/12          NaN               149
12-13        3/31/13         Easter              0
12-13        4/5/13           NaN                5

13-14        11/1/13          NaN               170
13-14        4/18/14          Nan                 2
13-14        4/20/14         Easter               0
13-14        4/22/14          Nan                 2

Thanks for your help.

This is easy to solve with groupby.

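import pandas as pd
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y')  # subtraction needs datetimes, not strings
ddf = df.groupby('season').apply(lambda x: x['Date'] - x.loc[x['Holiday_Name'] == 'Easter', 'Date'].iloc[0]).reset_index()  # each date minus its season's Easter
df['difference'] = ddf['Date']  # relies on df having a default 0..n-1 index, sorted by season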

  season       Date Holiday_Name difference
0  12-13 2012-11-01          NaN  -150 days
1  12-13 2012-11-02          Nan  -149 days
2  12-13 2013-03-31       Easter     0 days
3  12-13 2013-04-05          NaN     5 days
4  13-14 2013-11-01          NaN  -170 days
5  13-14 2014-04-18          Nan    -2 days
6  13-14 2014-04-20       Easter     0 days
7  13-14 2014-04-22          Nan     2 days
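
The result above is a timedelta column and is negative before Easter, whereas the question asks for plain day counts. Here is a minimal sketch of one way to get those, assuming each season has exactly one Easter row and that Date parses with the format %m/%d/%y. The helper name add_easter_difference is hypothetical, not part of pandas. It looks up each season's Easter date with map instead of relying on index alignment after reset_index, and it reads the day count through the .dt accessor, which is what the "'Series' object has no attribute 'days'" error in the question was missing.

```python
import pandas as pd


def add_easter_difference(df, holiday="Easter"):
    """Return a copy of df with an integer 'difference' column (days from Easter).

    Hypothetical helper; assumes exactly one `holiday` row per season.
    """
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"], format="%m/%d/%y")
    # Series indexed by season whose values are that season's Easter date.
    easter = out.loc[out["Holiday_Name"] == holiday].set_index("season")["Date"]
    # Look up the Easter date for every row, then subtract to get timedeltas.
    delta = out["Date"] - out["season"].map(easter)
    # .dt.days works on a whole Series; abs() makes days before Easter positive.
    out["difference"] = delta.dt.days.abs()
    return out


df = pd.DataFrame({
    "season": ["12-13"] * 4 + ["13-14"] * 4,
    "Date": ["11/1/12", "11/2/12", "3/31/13", "4/5/13",
             "11/1/13", "4/18/14", "4/20/14", "4/22/14"],
    "Holiday_Name": [None, None, "Easter", None, None, None, "Easter", None],
})
result = add_easter_difference(df)
print(result)
```

Drop the .abs() call if the sign (before or after Easter) matters.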

Note: you need to remove the trailing dots from the data, as in "Nan." and "Easter.".
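
The cleanup mentioned in the note could look roughly like this. It is a sketch that assumes the only stray characters are surrounding spaces and trailing dots, and that the text "NaN" or "Nan" should become a real missing value. The helper name clean_holiday_names is hypothetical.

```python
import pandas as pd


def clean_holiday_names(names):
    """Strip spaces and trailing dots; turn the text 'NaN'/'Nan' into real NaN."""
    cleaned = names.str.strip().str.rstrip(".")
    # Any spelling of the text "nan" is treated as a missing value.
    return cleaned.mask(cleaned.str.lower() == "nan")


raw = pd.Series(["NaN.", "Nan.", "Easter.", " Easter ", "Nan"])
cleaned = clean_holiday_names(raw)
print(cleaned)
```

After this, the comparison against 'Easter' matches the rows that were previously stored as "Easter.".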