Mapping train data to metadata
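I want to map my train data onto my metadata, because the longitude and latitude in my train data are not in a form I can use directly. My plan is to use the average of the distances in the metadata. I tried the merge function, but it did not give me what I wanted.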
Example:
1) Train data
station log lat
A 123 127
B 121 126
C 127 129
D 113 118
E 119 118
2) Metadata
from to
A C
B C
A D
A E
D E
3) Expected output
from fromlog fromlat tolog tolat
A 123 127 127 129
B 121 126 127 129
A 123 127 113 118
A 123 127 119 118
D 113 118 119 118
Here is one option with dplyr:
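library(dplyr)
# Join from df2 so each result keeps df2's row order; joining from df1
# returns rows in station order, which misaligns them in bind_cols()
d1 <- inner_join(df2, df1, by = c('to' = 'station')) %>%
  select(tolog = log, tolat = lat)
d2 <- inner_join(df2, df1, by = c('from' = 'station')) %>%
  select(fromlog = log, fromlat = lat)
bind_cols(df2 %>%
  select(from), d2, d1)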
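Since the question mentions trying merge, here is a minimal sketch of how the same lookup might be done with base merge. The helper column id and the object names m and merged are my own and only there for illustration. merge can reorder rows, so the sketch records the original row order first and restores it at the end.

```r
# Sample data from the question
df1 <- data.frame(station = c("A", "B", "C", "D", "E"),
                  log = c(123L, 121L, 127L, 113L, 119L),
                  lat = c(127L, 126L, 129L, 118L, 118L),
                  stringsAsFactors = FALSE)
df2 <- data.frame(from = c("A", "B", "A", "A", "D"),
                  to   = c("C", "C", "D", "E", "E"),
                  stringsAsFactors = FALSE)

# merge() may reorder rows, so remember the original row order
# (id is a hypothetical helper column, not part of the question's data)
df2$id <- seq_len(nrow(df2))

# Look up the coordinates of the `from` station
m <- merge(df2, df1, by.x = "from", by.y = "station")
names(m)[names(m) %in% c("log", "lat")] <- c("fromlog", "fromlat")

# Look up the coordinates of the `to` station
m <- merge(m, df1, by.x = "to", by.y = "station")
names(m)[names(m) %in% c("log", "lat")] <- c("tolog", "tolat")

# Restore the original row order and the expected column layout
merged <- m[order(m$id), c("from", "fromlog", "fromlat", "tolog", "tolat")]
rownames(merged) <- NULL
print(merged)
```

Renaming right after each merge avoids the log.x / log.y suffixes that merge would otherwise add on the second lookup.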
In base R, we can use match inside an lapply call:
do.call(cbind.data.frame, unlist(lapply(df2, function(x) {
  inds <- match(x, df1$station)
  list(log = df1$log[inds], lat = df1$lat[inds])
}), recursive = FALSE))
# from.log from.lat to.log to.lat
#1 123 127 127 129
#2 121 126 127 129
#3 123 127 113 118
#4 123 127 119 118
#5 113 118 119 118
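The output above has no from column and uses dotted column names. If the exact layout from the question is needed, one possible variation is to call match once per column and build the data frame by hand. This is only a sketch: the names i_from, i_to and result are my own.

```r
# Sample data from the question
df1 <- data.frame(station = c("A", "B", "C", "D", "E"),
                  log = c(123L, 121L, 127L, 113L, 119L),
                  lat = c(127L, 126L, 129L, 118L, 118L),
                  stringsAsFactors = FALSE)
df2 <- data.frame(from = c("A", "B", "A", "A", "D"),
                  to   = c("C", "C", "D", "E", "E"),
                  stringsAsFactors = FALSE)

# match() returns, for each element of its first argument, the position
# of the first equal element in the second argument (NA if there is none)
i_from <- match(df2$from, df1$station)
i_to   <- match(df2$to,   df1$station)

# Index df1 with those positions to pull the coordinates in df2's order
result <- data.frame(from    = df2$from,
                     fromlog = df1$log[i_from],
                     fromlat = df1$lat[i_from],
                     tolog   = df1$log[i_to],
                     tolat   = df1$lat[i_to],
                     stringsAsFactors = FALSE)
print(result)
```

Because match only returns positions, the row order of df2 is kept automatically, and a station missing from df1 would show up as NA rather than dropping the row.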
Data
df1 <- structure(list(station = c("A", "B", "C", "D", "E"), log = c(123L,
121L, 127L, 113L, 119L), lat = c(127L, 126L, 129L, 118L, 118L
)), row.names = c(NA, -5L), class = "data.frame")
df2 <- structure(list(from = c("A", "B", "A", "A", "D"), to = c("C",
"C", "D", "E", "E")), row.names = c(NA, -5L), class = "data.frame")