Set the first calendar occurrence of a date to the number "1" and then count on day by day
My dataset looks like this:
game_data <- data.frame(player = c(1,1,1,1,2,2,2,2), dateday = c("2015-04-08","2015-05-08","2015-05-10","2015-06-28","2015-09-01","2015-09-02","2015-09-03","2015-10-11"), points = c(20,80,140,230,40,60,98,102))
game_data
  player    dateday points
1      1 2015-04-08     20
2      1 2015-05-08     80
3      1 2015-05-10    140
4      1 2015-06-28    230
5      2 2015-09-01     40
6      2 2015-09-02     60
7      2 2015-09-03     98
8      2 2015-10-11    102
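I want a dataset with one observation per player per date, where each player's first date entry is called "1" and the following dates are counted on from there, day by day.
It should look like this (hopefully I calculated it correctly...):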
game_data_new <- data.frame(player = c(1,1,1,1,2,2,2,2), dateday = c(1,2,4,53,1,2,3,41), points = c(20,80,140,230,40,60,98,102))
game_data_new
  player dateday points
1      1       1     20
2      1       2     80
3      1       4    140
4      1      53    230
5      2       1     40
6      2       2     60
7      2       3     98
8      2      41    102
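This is straightforward with the dplyr package. Convert dateday to a Date object, which supports subtracting two dates to get the difference in days. Then, for each player, take the difference in days from that player's first date (day 0) and add 1.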
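library(dplyr)
game_data_new <- game_data %>%
  # parse the "YYYY-MM-DD" strings as Date objects
  mutate(dateday = as.Date(dateday)) %>%
  # compute the offset separately for each player
  group_by(player) %>%
  # days since the player's first date, with that first date counted as 1
  mutate(dateday = 1 + as.numeric(dateday - min(dateday)))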
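To make the date arithmetic concrete, here is a minimal base R sketch using only player 1's dates from the question. The names days, gap and day_index are just illustrative. It also shows that counting calendar days gives 1, 31, 33, 82 for player 1, not the hand-computed 1, 2, 4, 53 from the question.

```r
# Dates for player 1, copied from the question's data
days <- as.Date(c("2015-04-08", "2015-05-08", "2015-05-10", "2015-06-28"))

# Subtracting two Date objects yields a difftime measured in days
gap <- days[2] - days[1]
units(gap)        # "days"
as.numeric(gap)   # 30

# Day index relative to the first date, with the first date counted as day 1
day_index <- 1 + as.numeric(days - min(days))
day_index         # 1 31 33 82
```

The as.numeric() call drops the difftime class, so the result is a plain numeric column that can be used like any other number.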
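A base R solution (with dateday stored as character strings):
# cumsum() turns the day-to-day gaps into days since each player's first date (assumes dates are sorted within player)
game_data$dateday <- 1 + as.numeric(ave(game_data$dateday, game_data$player,
  FUN = function(days) cumsum(c(0, diff(as.Date(days, format = "%Y-%m-%d"))))))
#[1] 1 31 33 82 1 2 3 41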
game_data <- data.frame(
  player = c(1,1,1,1,2,2,2,2),
  dateday = c("2015-04-08","2015-05-08","2015-05-10","2015-06-28","2015-09-01","2015-09-02","2015-09-03","2015-10-11"),
  points = c(20,80,140,230,40,60,98,102),
  stringsAsFactors = FALSE)
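If the rows are not guaranteed to be sorted by date within each player, a variant that subtracts each player's earliest date may be safer. This is only a sketch of that idea, not the answer's original code; day_number is an illustrative helper name.

```r
game_data <- data.frame(
  player  = c(1, 1, 1, 1, 2, 2, 2, 2),
  dateday = c("2015-04-08", "2015-05-08", "2015-05-10", "2015-06-28",
              "2015-09-01", "2015-09-02", "2015-09-03", "2015-10-11"),
  points  = c(20, 80, 140, 230, 40, 60, 98, 102),
  stringsAsFactors = FALSE
)

# Convert to Date, then to a plain number (days since 1970-01-01),
# so that ave() works on an ordinary numeric vector
day_number <- as.numeric(as.Date(game_data$dateday, format = "%Y-%m-%d"))

# ave(..., FUN = min) repeats each player's earliest day number on every row
# of that player; subtracting it and adding 1 makes the first date day 1
game_data_new <- game_data
game_data_new$dateday <- 1 + day_number - ave(day_number, game_data$player, FUN = min)

game_data_new$dateday
# 1 31 33 82 1 2 3 41
```

Because the minimum is taken per player rather than from the first row, the result does not depend on row order.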