for loop with lists in R

Question

I would like to create two lists of data frames in a for loop, but I cannot get assign to do it:

dat <- data.frame(name = c(rep("a", 10), rep("b", 13)),
                  x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
                  y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2))

a <- dat[dat$name == "a",]
b <- dat[dat$name == "b",]

samp <- vector(mode = "list", length = 100)
h <- list(a,b)
hname <- c("a", "b")

for (j in 1:length(h)) {
  for (i in 1:100) {
    samp[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
    assign(paste("samp", hname[j], sep="_"), samp[[i]])
  }
}

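Instead of lists named samp_a and samp_b, I get vectors that hold only the result of the 100th sample. What I want is two lists, samp_a and samp_b, containing all the different samples for dat[dat$name == "a",] and dat[dat$name == "b",].

How can I do that?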

Answer 1

How about creating two separate lists and avoiding assign:

Option 1:

# create empty list
samp_a <- list()
samp_b <- list()

for (j in seq(h)) {

    # fill samp_a list
    if(j == 1){
        for (i in 1:100) {
            samp_a[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
        }
      # fill samp_b list
    } else if(j == 2){
        for (i in 1:100) {
            samp_b[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
        }
    }
}

You can also use assign, which gives a shorter answer:

Option 2:

for (j in seq(hname)) {
    l = list()
    for (i in 1:100) {
        l[[i]] <- sample(1:nrow(h[[j]]), nrow(h[[j]])*0.5)
    }
    assign(paste0('samp_', hname[j]), l)
    rm(l)
}
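
As a variation on the two options above, the samples can also be kept in one named list rather than in separately assigned variables. This is only a sketch: `draw_samples` and `samp_all` are hypothetical names that do not appear in the original post, and `floor()` is added here so that the sample size is an explicit integer (13 rows give 6 indices).

```r
# Rebuild the example data from the question
dat <- data.frame(
  name = c(rep("a", 10), rep("b", 13)),
  x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
  y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,
        1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2)
)
h <- list(dat[dat$name == "a", ], dat[dat$name == "b", ])
hname <- c("a", "b")

# Hypothetical helper: draw n index samples, each covering half the rows of df
draw_samples <- function(df, n = 100) {
  replicate(n, sample(nrow(df), floor(nrow(df) * 0.5)), simplify = FALSE)
}

# One named list with an element per group, each holding n index vectors
samp_all <- setNames(lapply(h, draw_samples), hname)

# The individual lists are still available by name if needed
samp_a <- samp_all[["a"]]
samp_b <- samp_all[["b"]]
```

Keeping everything in `samp_all` means adding a third group only requires extending `h` and `hname`, with no extra branches or calls to assign.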

Answer 2

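You can do this easily with lapply by combining it with the rep function. The rows are sampled whole, so the existing pairing of x and y is kept; this will not suit you if you want a random x matched with a random y.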

dat <- data.frame(name = c(rep("a", 10), rep("b", 13)),
              x = c(1,3,4,4,5,3,7,6,5,7,8,6,4,3,9,1,2,3,5,4,6,3,1),
              y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3,8.6,6.3,4.2,3.6,9.7,1.1,2.3,3.2,5.7,4.8,6.5,3.3,1.2))

a <- dat[dat$name == "a",]
b <- dat[dat$name == "b",]

h <- list(a,b)
hname <- c("a", "b")

testfunc <- function(df) {
  # df[sample(nrow(df), nrow(df)*0.5), ]  # returns the sampled rows of the data frame
  sample(nrow(df), nrow(df)*0.5)          # returns only the sampled row indices
}

lapply(h, testfunc) # Standard lapply call: gives just one sample for a and one for b
samp <- lapply(rep(h, 100), testfunc) # rep() repeats the list 100 times, giving 100 a samples and 100 b samples in one list (a, b, a, b, ...)

samp_a <- samp[c(TRUE, FALSE)] # The recycled T/F vector selects the odd elements, which in this case are the `a` samples.
samp_b <- samp[c(FALSE, TRUE)] # And here the even elements, which are the `b` samples.
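
If the sampled rows are needed rather than the indices, one way is to index the original data frame with each stored index vector. A minimal sketch, assuming a standalone copy of the `a` data frame from the question; `rows_a` is a hypothetical name, and `set.seed()` is there only to make the illustration repeatable.

```r
set.seed(1)  # assumption: fixed seed, only for a repeatable illustration

# Standalone copy of the `a` subset from the question
a <- data.frame(
  name = "a",
  x = c(1,3,4,4,5,3,7,6,5,7),
  y = c(1.1,3.2,4.3,4.1,5.5,3.7,7.2,6.2,5.9,7.3)
)

# 100 index samples, each covering half the rows of a
samp_a <- replicate(100, sample(nrow(a), floor(nrow(a) * 0.5)), simplify = FALSE)

# Turn each index vector into the matching rows; x and y stay paired
rows_a <- lapply(samp_a, function(idx) a[idx, ])
```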
