Fetching a score associated with a date 'around' 7 days ago
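This is what my data frame looks like. The rightmost (4th) column is the one I want. For a given name, I'm trying to get that person's score from 7 days ago. If there is no row dated exactly 7 days earlier, I want the score associated with the date closest to (the row's date - 7 days).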
library(data.table)
dt <- fread('
Name Score Date ScoreAround7DaysAgo
John 9 2016-01-01 NA
John 6 2016-01-10 9
John 3 2016-01-17 6
John 5 2016-01-18 6
Tom 9 2016-01-01 NA
Tom 6 2016-01-10 9
Tom 3 2016-01-17 6
Tom 5 2016-01-18 6
')
dt[, Date := as.IDate(Date)]
I tried dt[dt, roll=7+nearest]
but it didn't work. Thanks for your help.
This works:
dt[, DateLag := Date - 7L ]
w = dt[dt, which = TRUE, on = c("Name", Date = "DateLag"), roll = "nearest"]
dt[ , `:=`(ScoreLag = Score[replace(w, w == .I, NA_integer_)], DateLag = NULL)]
Name Score Date ScoreAround7DaysAgo ScoreLag
1: John 9 2016-01-01 NA NA
2: John 6 2016-01-10 9 9
3: John 3 2016-01-17 6 6
4: John 5 2016-01-18 6 6
5: Tom 9 2016-01-01 NA NA
6: Tom 6 2016-01-10 9 9
7: Tom 3 2016-01-17 6 6
8: Tom 5 2016-01-18 6 6
It finds the date nearest to Date-7, but discards the match if it is the same Date again (that is, if the row matched itself).
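To make the self-match step concrete, here is a minimal sketch that reruns the same join on John's rows only and keeps the intermediate row index around for inspection. The names w_clean and score_lag are just illustrative, and seq_along(w) stands in for .I since the lookup is done outside of j here.

```r
library(data.table)

# John's rows from the question (Tom's rows behave the same way).
dt <- data.table(
  Name  = "John",
  Score = c(9L, 6L, 3L, 5L),
  Date  = as.IDate(c("2016-01-01", "2016-01-10", "2016-01-17", "2016-01-18"))
)

# Target date for each row: 7 days earlier.
dt[, DateLag := Date - 7L]

# For each row, the row number in dt whose Date is nearest to that row's
# DateLag, matched within the same Name.
w <- dt[dt, which = TRUE, on = c("Name", Date = "DateLag"), roll = "nearest"]
print(w)

# A row whose nearest candidate is itself has no usable earlier score,
# so blank out those self-matches before looking up Score.
w_clean <- replace(w, w == seq_along(w), NA_integer_)
score_lag <- dt$Score[w_clean]
print(score_lag)
# [1] NA  9  6  6
```

Without the replace() step, the first row of each person would pick up its own score, because nothing earlier exists for it to roll to.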
dt[, val := .SD[.(Name = Name, Date = Date - 7), on = c('Name', 'Date'), roll = 'nearest',
c(NA, tail(Score, -1)), by = Name]$V1]
dt
# Name Score Date ScoreAround7DaysAgo val
#1: John 9 2016-01-01 NA NA
#2: John 6 2016-01-10 9 9
#3: John 3 2016-01-17 6 6
#4: John 5 2016-01-18 6 6
#5: Tom 9 2016-01-01 NA NA
#6: Tom 6 2016-01-10 9 9
#7: Tom 3 2016-01-17 6 6
#8: Tom 5 2016-01-18 6 6