当日期在另外两个日期 R 之间时,如何将数据集连接到另一个数据集
How can I connect a data set to another when the date is between 2 other dates R
我需要合并两个数据集,但如果一个数据集的日期介于另一个数据集的两个日期之间,则必须合并行。第一个数据集 data
如下所示:
Date Weight diff Loc.nr
2013-01-24 1040 7 2
2013-01-31 1000 7 2
2013-02-07 1185 7 2
2013-02-14 915 7 2
2013-02-21 1090 7 2
2013-03-01 1065 9 2
2013-01-19 500 4 9
2013-01-23 1040 3 9
2013-01-28 415 5 9
2013-01-31 650 3 9
2013-02-04 725 4 9
2013-02-07 450 3 9
2013-02-11 550 4 9
另一个数据集matches
是这样的:
Date winning
2013-01-20 1
2013-01-27 0
2013-02-03 1
2013-02-10 0
2013-02-17 1
2013-02-24 0
我写了一段代码,将 matches
的获胜列连接到数据集 "data":
data$winning <- NA
for(i in 1:nrow(data)) {
for(j in 1:nrow(matches)) {
if((data$Date[i]-data$diff[i]) < matches$Date[j] & data$Date[i] > matches$Date[j]) {
data$winning[i] <- matches$winning[j]
}
}
}
此代码需要 3 天才能完成 运行,有更快的方法吗?
我的预期输出是:
Date Weight diff Loc.nr winning
2013-01-24 1040 7 2 1
2013-01-31 1000 7 2 0
2013-02-07 1185 7 2 1
2013-02-14 915 7 2 0
2013-02-21 1090 7 2 1
2013-03-01 1065 9 2 0
2013-01-19 500 4 9 NA
2013-01-23 1040 3 9 NA
2013-01-28 415 5 9 0
2013-01-31 650 3 9 NA
2013-02-04 725 4 9 1
2013-02-07 450 3 9 NA
2013-02-11 550 4 9 0
根据 的建议,使用 非等值连接 你可以尝试一些东西
library(data.table)
setDT(data)[, winning := setDT(matches)[data[, .(upper = Date, lower = Date - diff)],
on = .(Date < upper, Date > lower)]$winning][]
Date Weight diff Loc.nr winning
1: 2013-01-24 1040 7 2 1
2: 2013-01-31 1000 7 2 0
3: 2013-02-07 1185 7 2 1
4: 2013-02-14 915 7 2 0
5: 2013-02-21 1090 7 2 1
6: 2013-03-01 1065 9 2 0
7: 2013-01-19 500 4 9 NA
8: 2013-01-23 1040 3 9 NA
9: 2013-01-28 415 5 9 0
10: 2013-01-31 650 3 9 NA
11: 2013-02-04 725 4 9 1
12: 2013-02-07 450 3 9 NA
13: 2013-02-11 550 4 9 0
我需要合并两个数据集,但如果一个数据集的日期介于另一个数据集的两个日期之间,则必须合并行。第一个数据集 data
如下所示:
Date Weight diff Loc.nr
2013-01-24 1040 7 2
2013-01-31 1000 7 2
2013-02-07 1185 7 2
2013-02-14 915 7 2
2013-02-21 1090 7 2
2013-03-01 1065 9 2
2013-01-19 500 4 9
2013-01-23 1040 3 9
2013-01-28 415 5 9
2013-01-31 650 3 9
2013-02-04 725 4 9
2013-02-07 450 3 9
2013-02-11 550 4 9
另一个数据集matches
是这样的:
Date winning
2013-01-20 1
2013-01-27 0
2013-02-03 1
2013-02-10 0
2013-02-17 1
2013-02-24 0
我写了一段代码,将 matches
的获胜列连接到数据集 "data":
data$winning <- NA
for(i in 1:nrow(data)) {
for(j in 1:nrow(matches)) {
if((data$Date[i]-data$diff[i]) < matches$Date[j] & data$Date[i] > matches$Date[j]) {
data$winning[i] <- matches$winning[j]
}
}
}
此代码需要 3 天才能完成 运行,有更快的方法吗?
我的预期输出是:
Date Weight diff Loc.nr winning
2013-01-24 1040 7 2 1
2013-01-31 1000 7 2 0
2013-02-07 1185 7 2 1
2013-02-14 915 7 2 0
2013-02-21 1090 7 2 1
2013-03-01 1065 9 2 0
2013-01-19 500 4 9 NA
2013-01-23 1040 3 9 NA
2013-01-28 415 5 9 0
2013-01-31 650 3 9 NA
2013-02-04 725 4 9 1
2013-02-07 450 3 9 NA
2013-02-11 550 4 9 0
根据
library(data.table)
setDT(data)[, winning := setDT(matches)[data[, .(upper = Date, lower = Date - diff)],
on = .(Date < upper, Date > lower)]$winning][]
Date Weight diff Loc.nr winning 1: 2013-01-24 1040 7 2 1 2: 2013-01-31 1000 7 2 0 3: 2013-02-07 1185 7 2 1 4: 2013-02-14 915 7 2 0 5: 2013-02-21 1090 7 2 1 6: 2013-03-01 1065 9 2 0 7: 2013-01-19 500 4 9 NA 8: 2013-01-23 1040 3 9 NA 9: 2013-01-28 415 5 9 0 10: 2013-01-31 650 3 9 NA 11: 2013-02-04 725 4 9 1 12: 2013-02-07 450 3 9 NA 13: 2013-02-11 550 4 9 0