How can I identify the closest point for a list of different points and store the ID in a list

Question

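I have two data frames that contain xy coordinates for different IDs at different points in time. What I want to do is determine which point from the previous year is closest to each point in the current year, and store that in a list. So for this sample data: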

oldnames <- c('A', 'B', 'C')
oldx <- c(0,5,10)
oldy <- c(0,5,10)
olddf <- data.frame(oldnames, oldx, oldy)

newnames <- c('D','E','F')
newx <- c(1, 6, 11)
newy <- c(1, 6, 101)
newdf <- data.frame(newnames, newx, newy)

I would like to produce a list that looks like this:

names  closest
D      A
E      B
F      C

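I have been trying to do this with apply (shown below), but at the moment it gives me an error message: Error in mutate_impl(.data, dots) : non-numeric argument to binary operator

Does anyone have any ideas?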

closestdf <- data.frame()
apply(newdf, 1, function(row) {
    name <- row["names"]
    xID <- row["x"]  
    yID <- row["y"]
    closest <- olddf %>%
               mutate(length = sqrt((xID - oldx)^2 + (yID - oldy)^2)) %>%
               mutate(rank = min_rank(length)) %>%
               filter(rank == '1')%>%
               mutate(total =  '1')
               closestdf <- rbind(closest, closestdf)
})

Cheers!

Answer 1

There is no need for an apply call; we can use purrr inside mutate instead:

library(tidyverse)
newdf %>% 
  mutate(closest = 
           map2_chr(newx, newy, 
                    ~as.character(olddf$oldnames)[which.min((.x - olddf$oldx) ^ 2 + (.y - olddf$oldy) ^ 2)]
           )
  )

This gives:

  newnames newx newy closest
1        D    1    1       A
2        E    6    6       B
3        F   11  101       C

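If we don't need the actual distances, there is no reason to take the square root: the square root is monotonic, so the point with the smallest squared distance is also the nearest point.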
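To illustrate why the square root can be skipped, here is a minimal base R check for a single query point. The data frame is an illustrative copy in the shape of the question's data, and the query point is F with y = 101, as in the printed results; the names px, py, sq_dist and dist are only used for this sketch.

```r
# Illustrative copy of the old points
olddf <- data.frame(oldnames = c("A", "B", "C"),
                    oldx = c(0, 5, 10),
                    oldy = c(0, 5, 10))

# One query point: F at (11, 101)
px <- 11
py <- 101

sq_dist <- (px - olddf$oldx)^2 + (py - olddf$oldy)^2  # squared distances
dist    <- sqrt(sq_dist)                              # Euclidean distances

# Both vectors have their minimum at the same position
which.min(sq_dist) == which.min(dist)

# Name of the nearest old point
as.character(olddf$oldnames[which.min(sq_dist)])
```

Both calls to which.min() should point at the third row, so the nearest old point is C either way. Note that which.min() returns the first minimum, so ties are resolved in favour of the earlier row.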

Or, clearer and more verbose, with intermediate steps:

newdf %>% 
  mutate(dists = map2(newx, newy, ~(.x - olddf$oldx) ^ 2 + (.y - olddf$oldy) ^ 2),
         ids = map_dbl(dists, which.min),
         closest = olddf$oldnames[ids])

This gives:

  newnames newx newy             dists ids closest
1        D    1    1        2, 32, 162   1       A
2        E    6    6         72, 2, 32   2       B
3        F   11  101 10322, 9252, 8282   3       C
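
As for the error in the question: apply() converts a data frame to a matrix with as.matrix() before iterating, and because newdf has a non-numeric name column, the whole matrix becomes character. Each row handed to the function is then a character vector, so the subtraction fails with "non-numeric argument to binary operator". The function also looks up row["x"] and row["y"], whereas the columns are called newx and newy. The sketch below shows the coercion and one possible base R alternative that iterates over the numeric columns only; the data is an illustrative copy, and m, res and closest are names chosen for this example.

```r
# Illustrative copies of the two data frames
olddf <- data.frame(oldnames = c("A", "B", "C"),
                    oldx = c(0, 5, 10),
                    oldy = c(0, 5, 10))
newdf <- data.frame(newnames = c("D", "E", "F"),
                    newx = c(1, 6, 11),
                    newy = c(1, 6, 101))

m <- as.matrix(newdf)  # the conversion apply() performs on a data frame
typeof(m)              # "character": the numeric columns were converted too

# Arithmetic on a row of that matrix fails; capture the message instead of stopping
res <- tryCatch(m[1, "newx"] - 0, error = function(e) conditionMessage(e))
res

# Base R alternative: loop over the numeric columns directly with mapply()
closest <- mapply(function(x, y) {
  sq_dist <- (x - olddf$oldx)^2 + (y - olddf$oldy)^2
  as.character(olddf$oldnames)[which.min(sq_dist)]
}, newdf$newx, newdf$newy)

data.frame(names = newdf$newnames, closest = closest)
```

Because mapply() receives newx and newy as numeric vectors, no conversion to character takes place, which is the same reason the map2-based versions above work.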
