Extract data in every chunk of a data.frame depending on a variable

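I am trying to extract the first record for each chunk (layer) of my data. I want to extract the first occurrence of a negative value (Mag) in each chunk, together with the corresponding time. Then I want to compare the times across the chunks and find the minimum and maximum. (This is the first part.)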

I got this far but then got stuck. Any help, including shortening the code, would be appreciated. Thanks!

# to make sample data
data_neg<-seq(-0.98,-1,length=300)
data_pos<-seq(0.98,1,length=300)
time<-seq(1,54,length=600)

# binding those neg and pos numbers together
tot_num<- data.frame(c(rep(time, times=4)),c(rep(cbind(data_pos,data_neg),times=4)))    
colnames(tot_num)=c("time","Mag")

# split data into chunks
n <- 4  # number of chunks
dfchunk<- split(tot_num, factor(sort(rank(row.names(tot_num))%%n)))
ext_fsw<-lapply(dfchunk,function(x)with(x,x[Mag<0,,drop=TRUE]))
# Here I want to extract the first appearance of a negative value of Mag in each chunk, together with the corresponding time.

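As the second part of my question: following @zx8754's suggestion, I tried reading my real data in a loop, selecting the first negative value in each chunk and plotting the results. However, I realized that my real data gives NA values like these (I read 11 data files from my folder; you can see the code below):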

     X1      X2
1 27.45 -0.0111
2 43.29 -0.9746
3 32.49 -0.9807
4 28.08 -0.0538
5 28.44 -0.0669
     X1      X2
1 28.71 -0.0834
2 43.29 -0.9736
3 32.49 -0.9521
4 29.16 -0.0032
5 29.70 -0.0469
     X1      X2
1 30.06 -0.0112
2 43.29 -0.9724
3 35.37 -0.0448
4 33.03 -0.0308
5 31.59 -0.0055
     X1      X2
1 35.19 -0.0476
2 43.29 -0.9712
3 39.42 -0.0171
4 40.50 -0.0143
5 36.18 -0.0395
     X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4 50.85 -0.0371
5    NA      NA
     X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4    NA      NA
5    NA      NA
     X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4    NA      NA
5    NA      NA
     X1     X2
1    NA     NA
2    NA     NA
3 49.77 -3e-04
4    NA     NA
5    NA     NA
     X1      X2
1    NA      NA
2    NA      NA
3    NA      NA
4 43.02 -0.0465
5 45.99 -0.9793
     X1      X2
1    NA      NA
2 37.98 -0.0005
3 45.18 -0.9784
4    NA      NA
5 45.09 -0.0551
     X1      X2
1    NA      NA
2    NA      NA
3 36.90 -0.0148
4 46.17 -0.9813
5    NA      NA

Here is the loop that reads my data:

data.list <- dir(pattern = "*.avgm", full.names = FALSE) # lists all the .avgm files in the directory

a <- 1:length(data.list)
for (k in 1:length(data.list)) {
  data1_stt <- read.table(data.list[k], colClasses = "numeric", skip = 0, fill = FALSE, sep = "", quote = "\"'", dec = ".", as.is = TRUE, strip.white = FALSE)
  StrL1 <- data1_stt[, 10]
  time <- data1_stt[, 1] * 10^-3
  tot_num <- data.frame(time, StrL1)
  colnames(tot_num) <- c("time", "Mag")
  n <- 5  # split data into chunks
  dfchunk <- split(tot_num, factor(sort(rank(row.names(tot_num)) %% n)))
  # which() gives the indices where the condition is TRUE; [1] takes the first one, which is used as the row index into x
  ext_fsw <- lapply(dfchunk, function(x) x[which(x$Mag < 0)[1], ])
  x.n <- data.frame(matrix(unlist(ext_fsw), nrow = 5, byrow = T))
  print(x.n)
  curr <- rep(c(8, 7, 6, 5, 4, 3.6, 3.8, 4.2, 4.4, 4.6, 4.8), each = 5)
  plot(curr, x.n, pch = 20)
}

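In short, the second step of my task is to read all the data and plot it for each current value, but I failed to do so. Sorry that I cannot put a reproducible example here. Because of the NA values in the data, the total length differs in terms of negative values.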

Try this:

ext_fsw<-lapply(dfchunk,function(x)
  x[which(x$Mag<0)[1],]
  )

which - gives the indices where the condition is TRUE; [1] then takes the first of them, and that index is passed to x as the row number.
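
To connect this with the first part of the question (comparing the times and finding the minimum and maximum), here is a minimal sketch on the sample data. It assumes the chunks are four consecutive, equal-sized blocks of rows; the names chunk_id, first_neg and time_range are made up for this illustration and are not from the question.

```r
# Sample data as in the question: 300 positive then 300 negative values, repeated 4 times
data_neg <- seq(-0.98, -1, length.out = 300)
data_pos <- seq(0.98, 1, length.out = 300)
time     <- seq(1, 54, length.out = 600)
tot_num  <- data.frame(time = rep(time, times = 4),
                       Mag  = rep(c(data_pos, data_neg), times = 4))

# Assumption: chunks are consecutive blocks of equal size
n_chunks <- 4
chunk_id <- rep(seq_len(n_chunks), each = nrow(tot_num) / n_chunks)
dfchunk  <- split(tot_num, chunk_id)

# First negative Mag in each chunk; a chunk without negatives gives a row of NA
ext_fsw <- lapply(dfchunk, function(x) x[which(x$Mag < 0)[1], ])

# Stack the one-row results into a single data frame, one row per chunk
first_neg <- do.call(rbind, ext_fsw)

# Smallest and largest "first negative" time, skipping chunks that had no negatives
time_range <- range(first_neg$time, na.rm = TRUE)
```

When a chunk contains no negative value, which(...)[1] is NA and the extracted row is all NA, which is likely where the NA rows in the real data come from. Using na.rm = TRUE (or dropping those rows with first_neg[!is.na(first_neg$time), ]) keeps them out of the comparison.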