Create a new column conditional on distance traveled between points in R

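I am trying to create a new column that is conditional on another column, somewhat like a moving average or moving window, but based on the distance between points. Take row 2, with a CO2 of 399.935, as an example. I would like the average of all points within 100 meters (traveled) of that point. In my example (see the CumDist column), rows 1, 3, 4, and 5 would be selected to calculate the average. The CumDist column (multiply by 100,000 to get meters) contains the cumulative distance traveled. I have 5000 points, and obviously the width (number of rows) of the moving window will vary.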


structure(list(CO2 = c(399.9350305, 399.9350305, 399.9350305, 
400.0320031, 400.0320031, 400.0320031, 399.7718229, 399.7718229, 
399.7718229, 399.3855075, 399.3855075, 399.3855075, 399.4708139, 
399.4708139, 399.4708139, 400.0362474, 400.0362474, 400.0362474, 
399.7556753, 399.7556753), lon = c(-103.7093538, -103.709352, 
-103.7093492, -103.7093467, -103.7093455, -103.7093465, -103.7093482, 
-103.7093596, -103.7094074, -103.7094625, -103.7094966, -103.709593, 
-103.709649, -103.7096717, -103.7097349, -103.7097795, -103.709827, 
-103.7099007, -103.709924, -103.7099887), lat = c(49.46972027, 
49.46972153, 49.46971675, 49.46971533, 49.46971307, 49.4697124, 
49.46970636, 49.46968214, 49.46960921, 49.46955984, 49.46953621, 
49.46945809, 49.46938994, 49.46935281, 49.46924309, 49.46918635, 
49.46914762, 49.46912566, 49.46912407, 49.46913321),distDiff = c(0.000342016147509882, 
0.000191466419697602, 0.000569046320857002, 0.000240367540492089, 
0.000265977754839834, 0.000103953049523505, 0.000682968856240796, 
0.0028176007969857, 0.00882013898948418, 0.00678966015562509, 
0.00360774024245839, 0.011149423290729, 0.00859796340323456, 
0.00444526066124642, 0.0130344010874029, 0.00709037369666853, 
0.00551435348701512, 0.00587377717110946, 0.00169806309901329, 
0.00479849401022625), CumDist = c(0.000342016147509882, 0.000533482567207484, 
0.00110252888806449, 0.00134289642855657, 0.00160887418339641, 
0.00171282723291991, 0.00239579608916071, 0.00521339688614641, 
0.0140335358756306, 0.0208231960312557, 0.0244309362737141, 0.0355803595644431, 
0.0441783229676777, 0.0486235836289241, 0.0616579847163269, 0.0687483584129955, 
0.0742627119000106, 0.08013648907112, 0.0818345521701333, 0.0866330461803596
)), .Names = c("X12CO2_dry", "coords.x1", "coords.x2", "V1", 
"CumDist"), row.names = 2:21, class = "data.frame")


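The window belonging to row i starts at n[i] and ends at m[i]-1. Hence the sum of the CO2 values in the i-th window is CumCO2[m[i]]-CumCO2[n[i]]. (Note that the indices in CumCO2 are shifted by 1 because of the leading 0.) Dividing this CO2 sum by the window size m[i]-n[i] gives the values meanCO2 of the new column: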

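# First row of each window: first index whose CumDist is >= x - 0.001 (100 m behind)
n <- sapply( df$CumDist,
             function(x) which.max( df$CumDist >= x-0.001 ) )

# One past the last row of each window: first index whose CumDist is > x + 0.001
# (the trailing Inf guarantees a match for the last rows)
m <- sapply( df$CumDist,
             function(x) which.max( c(df$CumDist,Inf) > x+0.001 ) )

# Cumulative CO2 sums with a leading 0, so the indices are shifted by 1
CumCO2 <- c( 0, cumsum(df$X12CO2_dry) )

meanCO2 <- ( CumCO2[m] - CumCO2[n] ) / (m-n)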


> n
 [1]  1  1  1  2  3  3  5  8  9 10 11 12 13 14 15 16 17 18 19 20
> m
 [1]  4  5  7  7  8  8  8  9 10 11 12 13 14 15 16 17 18 19 20 21
> meanCO2
 [1] 399.9350 399.9593 399.9835 399.9932 399.9606 399.9606 399.9453 399.7718 399.7718 399.3855 399.3855 399.3855 399.4708 399.4708 399.4708 400.0362
[17] 400.0362 400.0362 399.7557 399.7557
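
The same cumulative-sum idea can be sketched as a reusable function. This is only an illustration under stated assumptions: `window_mean` and its argument names are hypothetical, and it swaps the `sapply`/`which.max` search for base R's `findInterval`, which requires the distances to be sorted in non-decreasing order (true for a cumulative distance). The sample vectors are the first eight rows of the question's data.

```r
# Hypothetical helper (not part of the original answer): centred moving
# average over a distance window, computed from cumulative sums.
# Assumes `dist` is sorted in non-decreasing order, as a cumulative distance is.
window_mean <- function(dist, value, half_width) {
  # n: first index with dist >= x - half_width
  n <- findInterval(dist - half_width, dist, left.open = TRUE) + 1
  # m: first index with dist > x + half_width (length + 1 if there is none)
  m <- findInterval(dist + half_width, dist) + 1
  # Leading 0 shifts the indices by 1, exactly as CumCO2 does above
  cum <- c(0, cumsum(value))
  (cum[m] - cum[n]) / (m - n)
}

# First eight rows of the question's data
cum_dist <- c(0.000342016147509882, 0.000533482567207484, 0.00110252888806449,
              0.00134289642855657, 0.00160887418339641, 0.00171282723291991,
              0.00239579608916071, 0.00521339688614641)
co2 <- c(399.9350305, 399.9350305, 399.9350305, 400.0320031,
         400.0320031, 400.0320031, 399.7718229, 399.7718229)

# 0.001 in CumDist units corresponds to 100 m
res <- window_mean(cum_dist, co2, half_width = 0.001)
print(round(res, 4))
```

On these eight rows the result should agree with the first eight values of meanCO2 printed above, since no later row falls inside their windows. For a few thousand points either search is fast; `findInterval` mainly avoids rescanning the whole column for every row.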

Man, you beat me to it with a cleaner solution, mra68.


DF$CO2AVG <- NA_real_ # Create the result column before filling it in the loop

for (j in 1:nrow(DF)){ # Loop through all rows of your dataset

  CO2list <- NULL # Need to make a variable before storing to it in the loop

  for (i in 1:nrow(DF)){ # Loop through all distances in the table

    if (abs(DF$CumDist[i]-DF$CumDist[j]) <= 0.001) {
      # Check whether the difference in CumDist is <= 100/100000 for each entry;
      # CumDist[j] is the point with the 100 meter window around it.
      # Store the CO2 entries that are within the 100 meter window in a vector
      CO2list <- c(CO2list, DF$X12CO2_dry[i])
    }
  }

  # Get the mean of your list and store it in the column named CO2AVG
  DF$CO2AVG[j] <- mean(CO2list)
}
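
The double loop above can also be condensed into a single vectorised pass per row. The following is a minimal sketch, not the answerer's code: the small `DF` holds only the first five rows of the question's data (column names taken from its `.Names`), and the use of `vapply` is my own choice.

```r
# First five rows of the question's data, for illustration only
DF <- data.frame(
  X12CO2_dry = c(399.9350305, 399.9350305, 399.9350305, 400.0320031, 400.0320031),
  CumDist    = c(0.000342016147509882, 0.000533482567207484, 0.00110252888806449,
                 0.00134289642855657, 0.00160887418339641)
)

# For each row j, average the CO2 of every row whose CumDist lies
# within 0.001 (100 m) of row j, the same test as in the loop
DF$CO2AVG <- vapply(
  seq_len(nrow(DF)),
  function(j) mean(DF$X12CO2_dry[abs(DF$CumDist - DF$CumDist[j]) <= 0.001]),
  numeric(1)
)

print(DF)
```

This still compares every pair of rows, so the work grows with the square of the row count; for about 5000 points that remains manageable.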
