Group index by minute and compute average
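I have a pandas DataFrame named 'df'. I want to drop the seconds so that the index uses only the YYYY-MM-DD HH:MM format, and then group by minute and show the average value for each minute.
I want to turn this DataFrame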
value
2015-05-03 00:00:00 61.0
2015-05-03 00:00:10 60.0
2015-05-03 00:00:25 60.0
2015-05-03 00:00:30 61.0
2015-05-03 00:00:45 61.0
2015-05-03 00:01:00 61.0
2015-05-03 00:01:10 60.0
2015-05-03 00:01:25 60.0
2015-05-03 00:01:30 61.0
2015-05-03 00:01:45 61.0
2015-05-03 00:02:00 61.0
2015-05-03 00:02:10 60.0
2015-05-03 00:02:25 60.0
2015-05-03 00:02:40 60.0
2015-05-03 00:02:55 60.0
2015-05-03 00:03:00 59.0
2015-05-03 00:03:15 59.0
2015-05-03 00:03:20 59.0
2015-05-03 00:03:35 59.0
2015-05-03 00:03:40 60.0
into this DataFrame
value
2015-05-03 00:00 60.6
2015-05-03 00:01 60.6
2015-05-03 00:02 60.2
2015-05-03 00:03 59.2
I have tried code like
df['value'].resample('1Min').mean()
or
df.index.resample('1Min').mean()
but neither seems to work. Any ideas?
You need to convert the index to a DatetimeIndex first, because resample only works on a datetime-like index:
import pandas as pd

# Convert the index to a DatetimeIndex so that resample() can be used
df.index = pd.DatetimeIndex(df.index)
# Alternative conversion:
# df.index = pd.to_datetime(df.index)
print(df['value'].resample('1Min').mean())
# Equivalent form:
# print(df.resample('1Min')['value'].mean())
2015-05-03 00:00:00 60.6
2015-05-03 00:01:00 60.6
2015-05-03 00:02:00 60.2
2015-05-03 00:03:00 59.2
Freq: T, Name: value, dtype: float64
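The resampled index above still shows seconds, while the question asks for YYYY-MM-DD HH:MM. A minimal end-to-end sketch follows, assuming the original index holds plain strings (the question does not say how df was built, so the construction here is an assumption). It uses the 'min' alias instead of '1Min' and formats the index with strftime, which turns it into strings, so do this only as a final display step.

```python
import pandas as pd

# Sample data from the question; building the frame this way is an assumption.
timestamps = [
    "2015-05-03 00:00:00", "2015-05-03 00:00:10", "2015-05-03 00:00:25",
    "2015-05-03 00:00:30", "2015-05-03 00:00:45", "2015-05-03 00:01:00",
    "2015-05-03 00:01:10", "2015-05-03 00:01:25", "2015-05-03 00:01:30",
    "2015-05-03 00:01:45", "2015-05-03 00:02:00", "2015-05-03 00:02:10",
    "2015-05-03 00:02:25", "2015-05-03 00:02:40", "2015-05-03 00:02:55",
    "2015-05-03 00:03:00", "2015-05-03 00:03:15", "2015-05-03 00:03:20",
    "2015-05-03 00:03:35", "2015-05-03 00:03:40",
]
values = [
    61.0, 60.0, 60.0, 61.0, 61.0,
    61.0, 60.0, 60.0, 61.0, 61.0,
    61.0, 60.0, 60.0, 60.0, 60.0,
    59.0, 59.0, 59.0, 59.0, 60.0,
]
df = pd.DataFrame({"value": values}, index=timestamps)

# Parse the string index into a DatetimeIndex, then average per minute.
df.index = pd.to_datetime(df.index)
per_minute = df["value"].resample("min").mean()

# Format the index as YYYY-MM-DD HH:MM (the index becomes plain strings).
per_minute.index = per_minute.index.strftime("%Y-%m-%d %H:%M")
result = per_minute.to_frame()
print(result)
```

Keep the DatetimeIndex if you need further time-based operations, since a string index no longer supports resample.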
Another solution is to set the seconds in the index to 0 via astype:
print (df.groupby([df.index.values.astype('<M8[m]')])['value'].mean())
2015-05-03 00:00:00 60.6
2015-05-03 00:01:00 60.6
2015-05-03 00:02:00 60.2
2015-05-03 00:03:00 59.2
Name: value, dtype: float64
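The two approaches give the same result on the sample data, but they can differ when a minute has no rows at all. The sketch below illustrates this with a small made-up series that has a gap (the data is hypothetical and not from the question). It uses DatetimeIndex.floor('min') for the grouping key, which is a substitute for the astype call above, not the method from the answer.

```python
import pandas as pd

# Hypothetical data with no observations during minute 00:01.
idx = pd.to_datetime([
    "2015-05-03 00:00:00",
    "2015-05-03 00:00:30",
    "2015-05-03 00:02:15",
])
df = pd.DataFrame({"value": [61.0, 60.0, 59.0]}, index=idx)

# groupby on the floored timestamps: only minutes that contain data appear.
grouped = df.groupby(df.index.floor("min"))["value"].mean()

# resample: every minute in the range appears; empty minutes become NaN.
resampled = df["value"].resample("min").mean()

print(grouped)
print(resampled)
```

If the empty minutes are unwanted in the resample result, calling dropna() on it should bring the two outputs in line.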