Create new dataframe with one row for each pair of rows with consecutive sequence in existing dataframe

Question

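I have an existing dataframe in which each row represents a geographical point. Each point has a unique id, a user-defined sequence number, and a pair of geographic coordinates, as follows: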

id  Sequence Latitude Longitude Trajectory
544        0 41.37990   2.17220          1
545        1 41.37874   2.17104          1
546        0 41.37867   2.17092          2
547        1 41.37863   2.17084          2
548        2 41.37857   2.17073          2
549        3 41.37853   2.17065          2

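Note that these points come from a series of trajectories, where each trajectory is formed by consecutive points ordered by the Sequence variable. I also have an existing variable 'Trajectory' that groups these consecutive points. So, in this example, there is a trajectory of 2 points followed by a trajectory of 4 points.

I need to create a new dataframe (let's call it "lines") with one row for each line connecting two consecutive points of the same trajectory. Each row needs to contain the two point ids and their two pairs of coordinates and, ideally, the trajectory number. So the result for the previous example would be: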

Line  id1 Latitude1 Longitude1 id2 Latitude2 Longitude2 Trajectory
0     544  41.37990    2.17220 545  41.37874    2.17104          1
1     546  41.37867    2.17092 547  41.37863    2.17084          2
2     547  41.37863    2.17084 548  41.37857    2.17073          2
3     548  41.37857    2.17073 549  41.37853    2.17065          2

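I have been trying to use tidyverse, dplyr and similar libraries to avoid for loops, since I know they are inefficient and the existing dataframe has millions of points, but nothing has worked, and I could not find any similar question.

Any help on how to solve this is welcome. Thanks in advance!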

Answer 1

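When the original data is sorted by trajectory and sequence, as in your example, I would shift the id, Latitude, Longitude and Trajectory columns by one row to get id1/2, Latitude1/2, Longitude1/2 and Trajectory1/2, and then keep only the rows where both points share the same trajectory.

Suppose the original dataframe is named "points":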

n = nrow(points)
temp <- data.frame(
  id1 = points$id[1:(n-1)],
  Latitude1 = points$Latitude[1:(n-1)],
  Longitude1 = points$Longitude[1:(n-1)],
  id2 = points$id[2:n],
  Latitude2 = points$Latitude[2:n],
  Longitude2 = points$Longitude[2:n],
  Trajectory1 = points$Trajectory[1:(n-1)],
  Trajectory2 = points$Trajectory[2:n]
)
# Keep only the pairs whose two points belong to the same trajectory
temp <- temp[temp$Trajectory1 == temp$Trajectory2, ]
n = nrow(temp)
ret <- data.frame(
  Line = seq_len(n) - 1,  # 0-based line index; also safe when n is 0
  id1 = temp$id1,
  Latitude1 = temp$Latitude1,
  Longitude1 = temp$Longitude1,
  id2 = temp$id2,
  Latitude2 = temp$Latitude2,
  Longitude2 = temp$Longitude2,
  Trajectory = temp$Trajectory1
)

ret is the resulting dataframe.
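
The same idea can be wrapped in a function for data that is not already sorted. The following is only a minimal base R sketch under a few assumptions: build_lines is a hypothetical helper name, the sample data is copied from the question, and a pair counts as "consecutive" only when the Sequence values differ by exactly 1 (drop that condition if gaps in Sequence should still produce a line).

```r
# Sample data copied from the question.
points <- data.frame(
  id = 544:549,
  Sequence = c(0, 1, 0, 1, 2, 3),
  Latitude = c(41.37990, 41.37874, 41.37867, 41.37863, 41.37857, 41.37853),
  Longitude = c(2.17220, 2.17104, 2.17092, 2.17084, 2.17073, 2.17065),
  Trajectory = c(1, 1, 2, 2, 2, 2)
)

# build_lines is a hypothetical helper, not part of any package.
build_lines <- function(points) {
  # Sort so that consecutive points of a trajectory sit on adjacent rows.
  p <- points[order(points$Trajectory, points$Sequence), ]
  n <- nrow(p)

  first <- p[-n, ]   # every row except the last
  second <- p[-1, ]  # every row except the first

  # Assumption: keep a pair only if both points share a trajectory
  # and their Sequence values differ by exactly 1.
  keep <- first$Trajectory == second$Trajectory &
    (second$Sequence - first$Sequence) == 1

  data.frame(
    Line = seq_len(sum(keep)) - 1,
    id1 = first$id[keep],
    Latitude1 = first$Latitude[keep],
    Longitude1 = first$Longitude[keep],
    id2 = second$id[keep],
    Latitude2 = second$Latitude[keep],
    Longitude2 = second$Longitude[keep],
    Trajectory = first$Trajectory[keep]
  )
}

lines <- build_lines(points)
print(lines)
```

Everything here is vectorised, so there is no explicit for loop over rows; the sort is the most expensive step.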
