R: improving a time-consuming repeat function

Question

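I need to generate random samples (by simply resampling the values within each column), check whether each sample meets a condition, and keep only the "good" ones. I need 1000 such samples. With help from other posts I wrote the code below, but it is very slow. Is there a better solution?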

ds <- matrix(sample(0:1000, 120), ncol = 20)

rep <- function(ds) {
  success <- FALSE
  # redraw the whole matrix until every column sum is <= the original column sum
  while (!success) {
    x <- apply(ds, 2, sample, replace = TRUE)
    success <- all(colSums(x) <= colSums(ds))
  }
  # compute something based on the random matrix that meets the condition
  # and return the value
  y <- mean(x)
  return(y)
}
replicate(1000, {rep(ds)})

Thanks!

Answer 1

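Here is the idea I described in the comments. suc_samp returns one successful resample of a single vector, and my_rep applies that successful sampling to each column. (rep is a base R function, so you may want to avoid masking it.)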

suc_samp <- function(x) {
  while(1) {
    x_samp <- sample(x, size = length(x), TRUE)
    if(sum(x_samp) <= sum(x)) break
  }
  return(x_samp)
}

my_rep <- function(ds) {
  x <- apply(ds, 2, suc_samp)
  y <- mean(x)
  return(y)
}

ds <- matrix(sample(0:1000, 120), ncol=20)

replicate(1000, {my_rep(ds)})
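
A rough way to see why the per-column version is so much faster, offered as a sketch rather than a benchmark: the original code throws away the whole matrix unless all 20 columns pass in the same draw, while suc_samp retries each column on its own. Because the columns are resampled independently and the condition is checked column by column, rejecting columns individually should give the same distribution of accepted matrices. The helper col_success_prob below is hypothetical (it is not part of the answer above), and the seed and trial count are arbitrary choices.

```r
set.seed(42)  # arbitrary seed, only so this sketch is reproducible
ds <- matrix(sample(0:1000, 120), ncol = 20)

# Hypothetical helper: estimate the chance that one resample (with
# replacement) of a column has a sum no larger than the original sum.
col_success_prob <- function(x, n_trials = 2000) {
  mean(replicate(n_trials, sum(sample(x, length(x), replace = TRUE)) <= sum(x)))
}
p <- apply(ds, 2, col_success_prob)

# Joint rejection: every column must pass in the same matrix draw,
# so the expected number of full-matrix draws is 1 / prod(p).
expected_joint_draws <- 1 / prod(p)

# Per-column rejection: each column retries independently,
# so the expected total number of column draws is sum(1 / p).
expected_column_draws <- sum(1 / p)

print(c(joint = expected_joint_draws, per_column = expected_column_draws))
```

If each column passes roughly half of the time, the joint approach needs on the order of 2^20 matrix draws per accepted sample, whereas the per-column approach needs only a few dozen column draws.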
