Apply multiple functions to list of matrices to return data frame

I have a data frame like this:

df<- data.frame(year= c(rep("2004", 10), rep("2005", 10), rep("2006", 10), rep("2007", 10)), 
            lev1=c("A", "B", "C", "A", "D", "E", "D", "D", "B","B","C", "A","F","E","A","B",
                       "A", "B","C", "A", "D", "E", "D", "D", "B","B","C", "A","F","E","A", "B", "C", "A", "D","A","F","E","A","B" ), 
            lev2=c("X", "Y", "Z", "X", "W", "T", "W", "W", "Y","Y","Z", "T","U","V","Y","Y",
                      "W", "X","T", "W", "X", "Y", "Z", "X", "W", "T", "W", "W", "Y","Y","Z", "T","U","V","Y","Y",
                   "W", "X","T", "W"))

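I also have code that builds a list of matrices (Results), one per year. lev1 becomes the rows and lev2 becomes the columns. Each value in a matrix is the number of times the two levels occur together.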

sublist=NA
for (i in unique(df$year)){   
sublist[i]<-list(subset(df, df[,1] == i)) 
print(i)
}
Results = list()
for (i in 1: length(unique(sublist))){ 
if (length(sublist[[i]]) > 1 & length(sublist[[i]]) > 1 ){
rows<-unique(sublist[[i]][[2]]) 
cols<-unique(sublist[[i]][[3]]) 
matrix1<- matrix(nrow = length(rows), ncol = length(cols))
df = data.frame(sublist[[i]])
for (k in 1: length(rows)){
  sub_lev1<- subset(df,lev1 == rows[k]) 
  for (j in 1:length(cols)){ 
    sub_lev2<-subset(sub_lev1, lev2 == cols[j]) 
    matrix1[k,j]<-length(sub_lev2[,3])
  }
}
colnames(matrix1) <- cols
rownames(matrix1) <- rows
Results[[i]] = matrix1
}else{next}
}
Results

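I want to run a function (networklevel() from library("bipartite")) on each element of the list. It returns multiple values, one for each of several network indices. Below I do this for each matrix separately.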

d1<-networklevel(Results[[2]])
d2<-networklevel(Results[[3]])
d3<-networklevel(Results[[4]])
d4<-networklevel(Results[[5]])

The desired output is a data frame that includes the year, the name of each network index, and the value of each network index:

library(reshape2)  # melt() is assumed to come from reshape2

d1<-data.frame(as.list(d1))
d1<- melt(d1)
d1$year<- "2004"  # a single value is recycled to every row

d2<-data.frame(as.list(d2))
d2<- melt(d2)
d2$year<- "2005"

d3<-data.frame(as.list(d3))
d3<- melt(d3)
d3$year<- "2006"

d4<-data.frame(as.list(d4))
d4<- melt(d4)
d4$year<- "2007"

output<- rbind(d1,d2,d3, d4)

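I have a few questions: 1) For some reason the loop above returns the first matrix as NULL. How do I correct this? 2) When the matrices are indexed in Results they are not indexed by year, but by 1-5. I would like to adjust the loop so that the name of the year is used as the index. I believe this will help when creating the output df downstream.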

I tried the following to return the network indices for each element of the list, without success:

output<- lapply(mylist, FUN= function(x) networklevel(x)

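I would appreciate any help with running networklevel once over all elements of the list. By default networklevel returns multiple network indices, so I need a solution that runs networklevel and returns all of these indices for each matrix in an organized data frame that records which year each matrix came from. My actual dataset covers more than 20 years, so a solution that saves me from doing this separately for each year/matrix would be the most efficient.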

Your first question:

1) for some reason the loop above returns the first matrix as NULL. How do I correct this?

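Change sublist <- NA to sublist <- NULL. When you run your for loop, the NA is not removed from the object sublist; it stays there as the first element, ahead of the four yearly data frames. Your second loop then skips that element because its length is 1, so Results[[1]] is never assigned and shows up as NULL.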
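To make the effect of the initial value visible, here is a minimal sketch using a tiny made-up data frame (toy is hypothetical and stands in for one year's subset). It only uses base R and compares starting from NA with starting from NULL.

```r
# toy stands in for one year's subset of the real data (hypothetical)
toy <- data.frame(lev1 = c("A", "B"), lev2 = c("X", "Y"))

# Starting from NA: the NA survives as an unnamed first element
s_na <- NA
s_na["2004"] <- list(toy)
n_na <- length(s_na)        # NA placeholder plus the 2004 entry

# Starting from NULL: only the 2004 entry exists
s_null <- NULL
s_null["2004"] <- list(toy)
n_null <- length(s_null)

print(c(from_NA = n_na, from_NULL = n_null))
print(names(s_null))
```

Starting from list() instead of NULL behaves the same way here and states the intended type more explicitly.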

Second question:

2) When the matrices are indexed in Results they are not indexed by year, rather 1-5. I would like to adjust the loop so that the name of the year is indexed.

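I would try something like names(Results) <- c("2004", "2005", "2006", "2007"). This assumes Results holds exactly the four yearly matrices, i.e. the NULL first element is already gone.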
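If rewriting the loop is an option, the names can also come for free. The following is a sketch of an alternative, not the code from the question: it uses base R split() and table() on a small made-up data frame (toy_df is hypothetical), and the resulting list is named by year automatically. Note that table() sorts the row and column labels, whereas the original loop keeps them in order of first appearance.

```r
# Small made-up data in the same shape as df (hypothetical values)
toy_df <- data.frame(
  year = rep(c("2004", "2005"), each = 3),
  lev1 = c("A", "B", "A", "C", "C", "D"),
  lev2 = c("X", "Y", "X", "Z", "W", "Z")
)

# One data frame per year; the list names are the years
by_year <- split(toy_df, toy_df$year)

# Count lev1/lev2 co-occurrences within each year
Results <- lapply(by_year, function(d) {
  # as.character() drops unused factor levels if the columns are factors
  tab <- table(as.character(d$lev1), as.character(d$lev2))
  unclass(tab)  # plain integer matrix with row and column names
})

print(names(Results))
print(Results[["2004"]])
```

With a named list like this, Results[["2004"]] can be used instead of a numeric position.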

Third question:

looping output

In your lapply you don't need to create function(x); just pass networklevel directly, like this: output <- lapply(Results, bipartite::networklevel)

Then you can do something like this to turn it into a df/matrix:

#get to matrix (one row per list element; row names come from names(output))
dfoutput <- do.call(rbind, output)
#convert to df first so the index columns stay numeric
dfoutput2 <- as.data.frame(dfoutput)
#add row names as variable - in your case it is year of analysis
dfoutput3 <- cbind(nms = row.names(dfoutput2), dfoutput2)
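The question asks for a long data frame with one row per year and index. Below is a minimal sketch of that reshaping in base R. To keep it runnable without bipartite installed, it uses a stand-in function, toy_indices (hypothetical), which returns a named numeric vector, the same shape of result the question describes for networklevel. The matrices are made up as well.

```r
# Stand-in for bipartite::networklevel (hypothetical): any function that
# returns a named numeric vector can be used in its place
toy_indices <- function(m) {
  c(n_links = sum(m > 0), fill = sum(m > 0) / length(m))
}

# Made-up yearly matrices, named by year
Results <- list(
  "2004" = matrix(c(2, 0, 0, 1), nrow = 2),
  "2005" = matrix(c(1, 1, 0, 1), nrow = 2)
)

output <- lapply(Results, toy_indices)

# One small data frame per year, stacked into long format
long <- do.call(rbind, lapply(names(output), function(yr) {
  data.frame(
    year  = yr,
    index = names(output[[yr]]),
    value = unname(output[[yr]])
  )
}))

print(long)
```

Replacing toy_indices with bipartite::networklevel in the lapply call should give the same layout for the real indices, assuming Results is named by year.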