pandas groupby function is returning only null values as output
maindata['avg_delay']= maindata.groupby('name_customer')['Delay'].mean(numeric_only=False)
maindata.avg_delay
output:
0 NaT
1 NaT
2 NaT
4 NaT
5 NaT
..
49994 NaT
49996 NaT
49997 NaT
49998 NaT
49999 NaT
Name: avg_delay, Length: 40000, dtype: timedelta64[ns]
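maindata.groupby('name_customer')['Delay'].mean(numeric_only=False) gives you a pd.Series with the values of 'name_customer' as its index. Note that when you assign a pd.Series to a DataFrame column, the assignment is aligned index by index. Because the index of your maindata is not made up of the 'name_customer' values, the two indexes do not match, which is why you see the result above.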
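To make the index mismatch concrete, here is a minimal sketch on a toy frame. The three rows and the customer names "A" and "B" are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for maindata (illustrative values only)
maindata = pd.DataFrame({
    "name_customer": ["A", "A", "B"],
    "Delay": pd.to_timedelta([1, 3, 5], unit="D"),
})

# One mean per customer; the result is indexed by customer name
group_mean = maindata.groupby("name_customer")["Delay"].mean(numeric_only=False)
print(group_mean.index.tolist())

# Assignment aligns on index labels. The frame's labels are 0, 1, 2 and
# none of them equals "A" or "B", so every row receives a missing value.
maindata["avg_delay"] = group_mean
print(maindata["avg_delay"].isna().all())
```

The grouped Series has one row per customer, while the frame has one row per record, so even apart from the labels the two shapes differ.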
As the documentation says of transform:
Call function producing a like-indexed DataFrame on each group and return a DataFrame having the same indexes as the original object filled with the transformed values
So you can use the following line instead, but please check that the result is what you need.
maindata.groupby('name_customer')['Delay'].transform('mean')
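As a rough sketch of how that line can be used, the example below assigns the transform result back to the frame on toy data. The values and the non-contiguous index are assumptions chosen to resemble the question. The second approach, mapping the per-customer means back through name_customer, is an alternative that is not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for maindata, with gaps in the index as in the question
maindata = pd.DataFrame(
    {
        "name_customer": ["A", "A", "B"],
        "Delay": pd.to_timedelta([1, 3, 5], unit="D"),
    },
    index=[0, 1, 4],
)

# transform returns a Series with the same index as maindata,
# so the assignment lines up row by row
maindata["avg_delay"] = maindata.groupby("name_customer")["Delay"].transform("mean")

# Alternative (not from the original answer): compute the per-customer mean
# once, then look it up for each row by customer name
group_mean = maindata.groupby("name_customer")["Delay"].mean(numeric_only=False)
maindata["avg_delay_map"] = maindata["name_customer"].map(group_mean)

print(maindata)
```

Both columns should hold the same per-customer average on every row of that customer.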