How to tidy this dataset?
I have the following dataset that I need to tidy:
user_id topic may june july august september october
1 192775 talk 2 0 0 2 2 1
2 192775 walk 165 123 128 146 113 105
3 192775 bark 0 0 0 0 0 0
4 192775 harp 0 0 0 0 0 1
I would like to reshape it with tidyr into the following format:
user_id month talk walk bark harp
192775 may 2 165 0 0
192775 june 0 123 0 0
Any help is appreciated.
With:
library(tidyr)
df %>% gather(month, val, may:october) %>% spread(topic, val)
you get:
user_id month bark harp talk walk
1 192775 august 0 0 2 146
2 192775 july 0 0 0 128
3 192775 june 0 0 0 123
4 192775 may 0 0 2 165
5 192775 october 0 1 1 105
6 192775 september 0 0 2 113
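In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshape could be sketched as follows. The data frame is rebuilt from the table in the question, and the intermediate names long and wide are only illustrative:

```r
library(tidyr)

# Rebuild the example data from the question
df <- data.frame(
  user_id   = 192775,
  topic     = c("talk", "walk", "bark", "harp"),
  may       = c(2, 165, 0, 0),
  june      = c(0, 123, 0, 0),
  july      = c(0, 128, 0, 0),
  august    = c(2, 146, 0, 0),
  september = c(2, 113, 0, 0),
  october   = c(1, 105, 0, 1)
)

# Long format: one row per user_id / topic / month
long <- pivot_longer(df, cols = may:october,
                     names_to = "month", values_to = "val")

# Wide format: one column per topic
wide <- pivot_wider(long, names_from = topic, values_from = val)
print(wide)
```

Unlike spread(), which sorts the months alphabetically in the output above, pivot_wider() should keep rows and columns in order of first appearance, so the result should be closer to the layout asked for in the question.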
Another option is to use recast from the reshape2 package:
library(reshape2)
recast(df, user_id + variable ~ topic, id.var = c('user_id','topic'))
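recast() is a shortcut for melt() followed by dcast(), so the call above can be spelled out in two steps. This is a sketch on the question's data; passing variable.name = "month" is an addition of mine so that the result has a month column instead of the default variable:

```r
library(reshape2)

# Rebuild the example data from the question
df <- data.frame(
  user_id   = 192775,
  topic     = c("talk", "walk", "bark", "harp"),
  may       = c(2, 165, 0, 0),
  june      = c(0, 123, 0, 0),
  july      = c(0, 128, 0, 0),
  august    = c(2, 146, 0, 0),
  september = c(2, 113, 0, 0),
  october   = c(1, 105, 0, 1)
)

# Step 1: melt the month columns into (month, value) pairs
long2 <- melt(df, id.vars = c("user_id", "topic"),
              variable.name = "month")

# Step 2: cast the topics back out into columns
wide2 <- dcast(long2, user_id + month ~ topic, value.var = "value")
print(wide2)
```

Because melt() stores the month as a factor whose levels follow the original column order, the rows should come out from may to october rather than alphabetically.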