How can I calculate the distance between latitude and longitude along rows of columns in R?

Question

My df looks like this:

    bid        ts    latitude  longitude
1  827566 1999-10-07 42.40944 -88.17822
2  827566 2013-04-11 41.84740 -87.63126
3 1902966 2012-05-02 45.52607 -94.20649
4 1902966 2013-03-25 41.94083 -87.65852
5 3211972 2012-08-14 43.04786 -87.96618
6 3211972 2013-08-02 41.88258 -87.63760

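I want to create a new df that holds the time difference and the distance between each pair of consecutive points. The calculation should run down the rows, grouped by the same bid. I am currently doing this with the following for loop: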

library(geosphere)

lengthdata <- nrow(twopoint)
twopointdata <- data.frame(matrix(ncol = 4, nrow = lengthdata))
colnames(twopointdata) <- c("bid", "time", "d", "dsq")
n <- 1  # next output row to fill

# Stop one row early, because row i is compared with row i + 1
for (i in seq_len(lengthdata - 1))
{
  if (twopoint[i+1,1] == twopoint[i,1])
  {
    twopointdata[n,1] <- twopoint[i+1,1]
    twopointdata[n,2] <- as.numeric(twopoint[i+1,5] - twopoint[i,5])
    twopointdata[n,3] <- distm(c(twopoint[i+1,10], twopoint[i+1,9]),
                               c(twopoint[i,10], twopoint[i,9]),
                               fun = distHaversine)
    # This squares column 3 of twopoint; use twopointdata[n,3]^2
    # to square the distance stored on the line above
    twopointdata[n,4] <- twopoint[n,3]^2
    n <- n+1
  }
}
head(twopointdata)

(I removed some columns from the data shown above for clarity, so some of the column numbers in the code are off.)

My results look like this:

      bid time    d          dsq
1  827566 4935  77159.8 5.677201e+11
2 1902966  327 660457.0 6.436004e+16
3 3211972  353 132494.8 3.540118e+12
4 3692174 4722 727359.6 6.394166e+16
5 4404655 4833 201644.7 1.092944e+13
6 6644203 4518 210485.9 6.721980e+16

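It has the id of each data point, the time difference between consecutive data points, the distance calculated from the long and lat, and the squared distance. The problem: it is very slow, and eventually I will be running this on a very large dataset.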

For the time difference, I was able to do this successfully without a for loop using dplyr, like this:

 library(dplyr)
 library(geosphere)
 latlongdata2 <- latlongdata 
 latlongdata2 %>%
  group_by(bid)%>%
  transmute(
    bid = bid,
    t = c(NA,diff(ts)))

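I can't figure out how to handle the latitude and longitude, because unlike the ts values they sit in two different columns. Does anyone have any suggestions?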

P.S. The overall goal of this project is to do a mean squared displacement analysis on the data.

Answer 1

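I think you are over-complicating it a little. I wish geosphere::distHaversine had a more intuitive calling method (something analogous to diff), but it is not hard to work around: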

dat <- read.table(text = "  bid        ts    latitude  longitude
 827566 1999-10-07 42.40944 -88.17822
 827566 2013-04-11 41.84740 -87.63126
1902966 2012-05-02 45.52607 -94.20649
1902966 2013-03-25 41.94083 -87.65852
3211972 2012-08-14 43.04786 -87.96618
3211972 2013-08-02 41.88258 -87.63760", header = TRUE, stringsAsFactors = FALSE)
dat$ts <- as.Date(dat$ts)

library(dplyr)
library(geosphere)
group_by(dat, bid) %>%
  mutate(
    d = c(NA,
          distHaversine(cbind(longitude[-n()], latitude[-n()]),
                        cbind(longitude[  -1], latitude[  -1]))),
    dts = c(NA, diff(ts))
  ) %>%
  ungroup() %>%
  filter( ! is.na(d) )
# # A tibble: 3 × 6
#       bid         ts latitude longitude         d   dts
#     <int>     <date>    <dbl>     <dbl>     <dbl> <dbl>
# 1  827566 2013-04-11 41.84740 -87.63126  77159.35  4935
# 2 1902966 2013-03-25 41.94083 -87.65852 660457.41   327
# 3 3211972 2013-08-02 41.88258 -87.63760 132494.65   353
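
The trick in the pipeline above is the pairing: longitude[-n()] drops the last row of each group and longitude[-1] drops the first, so every point lines up with its successor. In case it helps to see that without any packages, here is a minimal base R sketch of the same idea. The helpers haversine() and lag_in_group() are made-up names for illustration, not geosphere or dplyr functions, and the radius of 6378137 m is my assumption about the default that distHaversine uses. It also adds the dsq column that the question needs for the mean squared displacement step.

```r
# Hand-written haversine distance in metres between paired points (hypothetical helper).
# r = 6378137 is assumed to match the default radius of geosphere::distHaversine.
haversine <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Same six sample rows as in the question
dat <- data.frame(
  bid = c(827566, 827566, 1902966, 1902966, 3211972, 3211972),
  ts = as.Date(c("1999-10-07", "2013-04-11", "2012-05-02",
                 "2013-03-25", "2012-08-14", "2013-08-02")),
  latitude = c(42.40944, 41.84740, 45.52607, 41.94083, 43.04786, 41.88258),
  longitude = c(-88.17822, -87.63126, -94.20649, -87.65852, -87.96618, -87.63760)
)

# Previous value within each group; NA on the first row of a group (hypothetical helper)
lag_in_group <- function(x, g) {
  ave(x, g, FUN = function(v) c(NA, v[-length(v)]))
}

prev_lon <- lag_in_group(dat$longitude, dat$bid)
prev_lat <- lag_in_group(dat$latitude, dat$bid)
prev_ts  <- lag_in_group(as.numeric(dat$ts), dat$bid)

# One vectorised call over all rows; rows without a predecessor come out as NA
dat$d   <- haversine(prev_lon, prev_lat, dat$longitude, dat$latitude)
dat$dsq <- dat$d^2
dat$dts <- as.numeric(dat$ts) - prev_ts   # time difference in days

# Keep only rows that have a previous point in the same bid
steps <- dat[!is.na(dat$d), c("bid", "dts", "d", "dsq")]
print(steps)
```

On the sample rows, d and dts should agree with the tibble above up to rounding. Like the dplyr version, this assumes the rows are already sorted by ts within each bid.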

