Enhancing speed / vectorization of for loop including sample-function R

Question

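I am looking for a fast way to create a matrix of integer values, each drawn with a given probability. Given a vector L=c(3,4,2) and a probability vector Prob=c(0.4,0.35,0.25,0.1,0.25,0.4,0.25,0.6,0.4) with sum(L) elements, I want to pick, for example, one element from 1:L[1] = 1:3 with probabilities Prob[1:L[1]] = c(0.4,0.35,0.25). This should be done for every element of L several times, as set by the parameter rows, and the results stored in a matrix called POP.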

My solution is very slow because of the two for loops, and I am looking for a better-performing solution using vectorization or other techniques.

My solution to this problem is the following:

L = c(3,4,2)
L_cum = c(0,cumsum(L)) #vector to call vector sections from Prob
Prob = c(0.4,0.35,0.25,0.1,0.25,0.4,0.25,0.6,0.4)  #probability vector for sum(L) elements
rows = 5  #number of rows of matrix POP
POP = matrix(0,rows,length(L)) 

for(i in 1:rows){
 for(j in 1:length(L)){
   POP[i,j] = sample(1:L[j],1,prob=Prob[(L_cum[j]+1):L_cum[j+1]])
 }
}

Answer 1

I would try:

set.seed(1234)
#set the number of extractions
n<-10
vapply(split(Prob,rep(seq_along(L),L)), 
          function(x) sample(length(x),n,replace=TRUE,prob=x),
          integer(n))
#      1 2 3
# [1,] 1 4 1
# [2,] 2 2 1
# [3,] 2 3 1
# [4,] 2 1 1
# [5,] 3 3 1
# [6,] 2 4 2
# [7,] 1 3 1
# [8,] 1 3 2
# [9,] 2 3 2
#[10,] 2 3 1
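
To connect this back to the question, here is a minimal sketch that applies the same split/vapply idea to the question's own inputs (L, Prob, rows) and stores the result in POP. The name groups and the final matrix() call are my additions for illustration, not part of the original answer; the seed is arbitrary.

```r
# Inputs as defined in the question
L    <- c(3, 4, 2)
Prob <- c(0.4, 0.35, 0.25, 0.1, 0.25, 0.4, 0.25, 0.6, 0.4)
rows <- 5

# Cut Prob into one probability vector per element of L
# (`groups` is a hypothetical helper name)
groups <- split(Prob, rep(seq_along(L), L))

set.seed(1234)  # arbitrary seed, only for reproducibility
# One sample() call per column draws all `rows` values at once,
# replacing the inner per-cell loop from the question
POP <- vapply(groups,
              function(p) sample(length(p), rows, replace = TRUE, prob = p),
              integer(rows))

# Force a plain rows x length(L) matrix; this also drops the
# "1", "2", "3" column names that split() leaves behind
POP <- matrix(POP, nrow = rows)

dim(POP)
```

This keeps a loop over the length(L) columns (hidden inside vapply) but removes the loop over rows, so the number of sample() calls no longer grows with rows. The matrix() call also covers rows = 1, where vapply would otherwise return a plain vector.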

