Grouping data by year in Python

Question

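I have a CSV file containing data from 2006/01/01 to 2011/01/01, with the columns (timestamp, heure, lat, lon, impact). I need to compute the mean of impact year by year and then plot it. I believe I should group the data by day, then by month, then by year.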

Here is a sample of my data:

 timestamp,heure,lat,lon,impact,type
 2007-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
 2007-01-02 00:00:00,00:07:28,34.5293,-10.2384,17.7,1
 2007-01-02 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
 2007-01-03 00:00:00,01:14:29,36.5685,0.9043,36.8,1
 2007-01-03 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1

Here is the code I am using:

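import pandas as pd
import matplotlib.pyplot as plt

names = ["timestamp", "heure", "lat", "lon", "impact", "type"]
data = pd.read_csv('flash.txt', names=names, parse_dates=['timestamp'], index_col=['timestamp'])
print(data.head())
# pd.Grouper(freq=...) is the current spelling of the older pd.TimeGrouper
daily = data.groupby(pd.Grouper(freq='D'))['impact'].count()
# daily is already a Series, so no column selection is needed here
monthly = daily.groupby(pd.Grouper(freq='M')).count()
# plot the last aggregate computed above
ax = monthly.plot(kind='bar')
plt.show()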

Here is the result I get:

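So, my main question is how to group by year, so that January of the first year in the file through January of the last year are gathered in one bar, and so on for every month. Any ideas?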

Answer 1

You can do a combined groupby (assuming your index is a DatetimeIndex):

data.groupby([(data.index.year),(data.index.month)])['impact'].count()

This groups by year and by month.
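As a rough illustration of the combined groupby, here is a minimal sketch. The first five rows are the sample from the question; the two 2008 rows are made up so that more than one group shows up, and the names `csv_text`, `counts` and `means` are only for this example. It computes both the count used above and the mean that the question asks about.

```python
import io

import pandas as pd

# First five rows: the sample from the question.
# The two 2008 rows are hypothetical, added only to get several groups.
csv_text = """timestamp,heure,lat,lon,impact,type
2007-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
2007-01-02 00:00:00,00:07:28,34.5293,-10.2384,17.7,1
2007-01-02 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
2007-01-03 00:00:00,01:14:29,36.5685,0.9043,36.8,1
2007-01-03 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
2008-01-05 00:00:00,02:00:00,34.0,-9.0,20.0,1
2008-02-07 00:00:00,03:00:00,35.0,-8.0,-10.0,2
"""

data = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"],
                   index_col="timestamp")

# Two grouping keys derived from the DatetimeIndex: year first, then month.
grouped = data.groupby([data.index.year, data.index.month])["impact"]

counts = grouped.count()  # number of non-null impact values per (year, month)
means = grouped.mean()    # mean impact per (year, month)

# Both levels inherit the name "timestamp", so give them clearer names.
counts.index.names = ["year", "month"]
means.index.names = ["year", "month"]

print(counts)
print(means)
```

With real data, something like `means.unstack(level="year").plot(kind="bar")` would be one way to draw a bar per year within each month, assuming matplotlib is installed.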

Answer 2

Another approach:

data.groupby(lambda x: (x.year, x.month)).size()

Similarly:

data.groupby([lambda x: x.year, lambda x: x.month]).size()
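A small sketch of the function-based form, on hypothetical data (one row per day for 2007 and 2008; the frame and the variable names are assumptions for illustration). A function passed to `groupby` is called on each index label, here a `Timestamp`. If the goal is to pool every January into one bar, as the question seems to describe, keying on the month alone may be closer to what is wanted.

```python
import pandas as pd

# Hypothetical data: one row per day across two years.
idx = pd.date_range("2007-01-01", "2008-12-31", freq="D")
data = pd.DataFrame({"impact": range(len(idx))}, index=idx)

# Each lambda receives an index label (a Timestamp) and returns a group key.
per_year_month = data.groupby([lambda ts: ts.year, lambda ts: ts.month]).size()

# Keying on the month only pools all Januaries together, all Februaries, etc.
per_month = data.groupby(lambda ts: ts.month).size()
mean_per_month = data.groupby(lambda ts: ts.month)["impact"].mean()

print(per_year_month.head())
print(per_month)
print(mean_per_month)
```

Note that `size()` counts rows in each group, including rows with missing values, whereas `count()` only counts non-null entries in the selected column.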

Tags: python-2.7, pandas, pandas-groupby