Pandas 数据框重塑
Pandas dateframe reshape
我正在玩一些日期,如下所示。
song_id action_type ds
0 effb071415be51f11e845884e67c0f8c 1 14 days
1 f87ff481d85d2f95335ab602f38a7655 1 13 days
2 8a27d9a6c59628c991c154e8d93f412e 2 12 days
3 ecea5fe33e6817d09c395f2910479728 1 11 days
4 31a3d0420d89c9b121bb55dbdbbeda6b 1 13 days
5 096e604f7b152fad0246ae731ed8ca73 1 15 days
6 5a7d9d75b898cd1b19ef6941cc1ddccf 1 12 days
7 8a103bd3a3295fbf9b3c3bf7972db299 2 12 days
8 17e90c7b3b7ebbe4b47344fcfab2fa7a 1 11 days
9 8a27d9a6c59628c991c154e8d93f412e 2 13 days
我想知道如何重塑它并得到类似的东西:
song_id day1_type1 day1_type2 day1_type3 day2_type1 ......... dayn_typen
(songid) (count of type1 on day1) (Nan if no count)...... (count of typen on dayn)
现在我用
action.groupby(['song_id','ds','action_type']).action_type.sum()
并得到类似的东西:
song_id ds action_type
00088cb1e6d740fcd42bc8ff2673c805 3 days 1 1
4 days 1 1
13 days 2 2
27 days 1 1
41 days 1 1
2 2
42 days 1 1
67 days 2 2
68 days 1 1
75 days 2 2
0008de587f84d8c9491502c5a5c8b466 0 days 1 4
2 4
1 days 1 17
4 days 1 6
7 days 1 10
8 days 1 5
我怎样才能重建或重塑我得到的东西以获得我想要的东西?
提前致谢。
>>> (df.groupby(['song_id', 'ds', 'action_type'])
.action_type
.sum()
.unstack(['action_type', 'ds'])
.fillna(0)
.sortlevel(level=[0, 1], axis=1))
action_type 1 2
ds 11 days 12 days 13 days 14 days 15 days 12 days 13 days
song_id
096e604f7b152fad0246ae731ed8ca73 0 0 0 0 1 0 0
17e90c7b3b7ebbe4b47344fcfab2fa7a 1 0 0 0 0 0 0
31a3d0420d89c9b121bb55dbdbbeda6b 0 0 1 0 0 0 0
5a7d9d75b898cd1b19ef6941cc1ddccf 0 1 0 0 0 0 0
8a103bd3a3295fbf9b3c3bf7972db299 0 0 0 0 0 2 0
8a27d9a6c59628c991c154e8d93f412e 0 0 0 0 0 2 2
ecea5fe33e6817d09c395f2910479728 1 0 0 0 0 0 0
effb071415be51f11e845884e67c0f8c 0 0 0 1 0 0 0
f87ff481d85d2f95335ab602f38a7655 0 0 1 0 0 0 0
我正在玩一些日期,如下所示。
song_id action_type ds
0 effb071415be51f11e845884e67c0f8c 1 14 days
1 f87ff481d85d2f95335ab602f38a7655 1 13 days
2 8a27d9a6c59628c991c154e8d93f412e 2 12 days
3 ecea5fe33e6817d09c395f2910479728 1 11 days
4 31a3d0420d89c9b121bb55dbdbbeda6b 1 13 days
5 096e604f7b152fad0246ae731ed8ca73 1 15 days
6 5a7d9d75b898cd1b19ef6941cc1ddccf 1 12 days
7 8a103bd3a3295fbf9b3c3bf7972db299 2 12 days
8 17e90c7b3b7ebbe4b47344fcfab2fa7a 1 11 days
9 8a27d9a6c59628c991c154e8d93f412e 2 13 days
我想知道如何重塑它并得到类似的东西:
song_id day1_type1 day1_type2 day1_type3 day2_type1 ......... dayn_typen
(songid) (count of type1 on day1) (Nan if no count)...... (count of typen on dayn)
现在我用
action.groupby(['song_id','ds','action_type']).action_type.sum()
并得到类似的东西:
song_id ds action_type
00088cb1e6d740fcd42bc8ff2673c805 3 days 1 1
4 days 1 1
13 days 2 2
27 days 1 1
41 days 1 1
2 2
42 days 1 1
67 days 2 2
68 days 1 1
75 days 2 2
0008de587f84d8c9491502c5a5c8b466 0 days 1 4
2 4
1 days 1 17
4 days 1 6
7 days 1 10
8 days 1 5
我怎样才能重建或重塑我得到的东西以获得我想要的东西?
提前致谢。
>>> (df.groupby(['song_id', 'ds', 'action_type'])
.action_type
.sum()
.unstack(['action_type', 'ds'])
.fillna(0)
.sortlevel(level=[0, 1], axis=1))
action_type 1 2
ds 11 days 12 days 13 days 14 days 15 days 12 days 13 days
song_id
096e604f7b152fad0246ae731ed8ca73 0 0 0 0 1 0 0
17e90c7b3b7ebbe4b47344fcfab2fa7a 1 0 0 0 0 0 0
31a3d0420d89c9b121bb55dbdbbeda6b 0 0 1 0 0 0 0
5a7d9d75b898cd1b19ef6941cc1ddccf 0 1 0 0 0 0 0
8a103bd3a3295fbf9b3c3bf7972db299 0 0 0 0 0 2 0
8a27d9a6c59628c991c154e8d93f412e 0 0 0 0 0 2 2
ecea5fe33e6817d09c395f2910479728 1 0 0 0 0 0 0
effb071415be51f11e845884e67c0f8c 0 0 0 1 0 0 0
f87ff481d85d2f95335ab602f38a7655 0 0 1 0 0 0 0