How can I use a for loop to update diagonals in my covariance matrix?

Question

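I have created simulated data for two variables using mvrnorm, and I want to correlate these variables at 0, .5, .7, and .9 inside a loop. However, every time I run my for loop I only get the .9 correlation and none of the other correlation conditions.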

library(MASS) #library I needed to create simulated data with mvrnorms

num_iter <- 75
N <- 30                       # setting my sample size
mu <- c(50.5, 10.5)           # setting the std
R <- c(0,.5,.7,.9)            # this vector defines the different correlation conditions I will add

# saving files
dir.create("simulated1data") # This creates a directory to store files

# performing 75 iterations and so there should be 75 data files in the folder I made
for(i in 1:num_iter){
  for(j in 1:4){
    cov <- matrix(c(1,R[j],R[j],1),2,2)
    x <- mvrnorm(N,mu,cov)
    write.table(x, file=paste("simulated1data/simdata_",i,"_",j,".txt",sep="")) # writing to separate txt file
  }
}

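As far as I understand, my for(j in 1:4) is not properly running through all the jth values in my R vector, which is why the variables in x are always correlated at .9. Does anyone know how to fix this? Thank you for your time!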

Answer 1

To assign the values of R, create the cov matrix beforehand and use a logical index matrix, imat.

The first code block is the same as in the question.

library(MASS) #library I needed to create simulated data with mvrnorms

num_iter <- 75
N <- 30                       # setting my sample size
mu <- c(50.5, 10.5)           # setting the means of the two variables
R <- c(0, 0.5, 0.7, 0.9)      # this vector defines the different correlation conditions I will add

This is for testing on my system.

# saving files
dirsimdata <- "~/tmp/simulated1data"
dir.create(dirsimdata) # This creates a directory to store files

Now the cov and imat matrices.

# index matrix used to assign values from R
imat <- matrix(c(FALSE, TRUE, TRUE, FALSE), nrow = 2)
# start with all 1's
cov <- matrix(1, nrow = 2, ncol = 2)

Finally, the double for loop.

# performing 75 iterations x 4 correlation conditions, so 300 data files are written
for(i in 1:num_iter){
  for(j in 1:4){
    cov[imat] <- R[j]
    x <- mvrnorm(N, mu, cov)
    flname <- paste0("simdata_", i, "_", j, ".txt")
    flname <- file.path(dirsimdata, flname)
    write.table(x, file = flname) # writing to separate txt file
  }
}
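
As a quick check of the indexing step, here is a minimal sketch (not part of the code above, and it writes no files) that only prints the matrix after each assignment. The TRUE cells of imat are the two off-diagonal positions, so the diagonal of 1's is never touched.

```r
# Logical index matrix: TRUE marks the off-diagonal cells to overwrite
imat <- matrix(c(FALSE, TRUE, TRUE, FALSE), nrow = 2)
# Start with all 1's; the diagonal (variances) stays at 1 throughout
cov <- matrix(1, nrow = 2, ncol = 2)
R <- c(0, 0.5, 0.7, 0.9)

for (j in seq_along(R)) {
  cov[imat] <- R[j]   # only cells [2,1] and [1,2] change
  print(cov)          # inspect the matrix used for this condition
}
```

If the four printed matrices differ only in their off-diagonal entries, the loop is visiting every value of R as intended.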

Answer 2

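I don't find anything wrong with your code. You mislabel mu as the standard deviation, but it holds the mean of each variable, and R is a covariance rather than a correlation. Because you set the standard deviation of each variable to 1 in the covariance matrix, the two happen to coincide numerically here. If I set num_iter <- 2 and call set.seed(42) before entering the loop, I get reasonable correlations, considering that the sample size is only 30: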

cor(read.table("simulated1data/simdata_1_1.txt"))
#          V1       V2
# V1 1.000000 0.204011
# V2 0.204011 1.000000
cor(read.table("simulated1data/simdata_1_2.txt"))
#           V1        V2
# V1 1.0000000 0.2706851
# V2 0.2706851 1.0000000
cor(read.table("simulated1data/simdata_1_3.txt"))
#           V1        V2
# V1 1.0000000 0.6727047
# V2 0.6727047 1.0000000
cor(read.table("simulated1data/simdata_1_4.txt"))
#           V1        V2
# V1 1.0000000 0.9306898
# V2 0.9306898 1.0000000
cor(read.table("simulated1data/simdata_2_1.txt"))
#            V1         V2
# V1 1.00000000 0.06184222
# V2 0.06184222 1.00000000
cor(read.table("simulated1data/simdata_2_2.txt"))
#           V1        V2
# V1 1.0000000 0.3686962
# V2 0.3686962 1.0000000
cor(read.table("simulated1data/simdata_2_3.txt"))
#           V1        V2
# V1 1.0000000 0.7660853
# V2 0.7660853 1.0000000
cor(read.table("simulated1data/simdata_2_4.txt"))
#           V1        V2
# V1 1.0000000 0.8589621
# V2 0.8589621 1.0000000
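
To illustrate that the gaps above come from sampling variability and not from the loop, here is a rough sketch that compares three draws for each target value. The variable names (Sigma, x_small, x_large, x_exact) and the sample size of 1e5 are my own choices for illustration. The empirical = TRUE argument of mvrnorm makes mu and Sigma describe the sample itself instead of the population, so the sample correlation should match the target up to floating-point error.

```r
library(MASS)  # provides mvrnorm()

set.seed(42)                 # arbitrary seed, for reproducibility only
mu <- c(50.5, 10.5)          # means of the two variables
R  <- c(0, 0.5, 0.7, 0.9)    # target correlations (both variances are 1)

for (r in R) {
  Sigma <- matrix(c(1, r, r, 1), 2, 2)
  x_small <- mvrnorm(30, mu, Sigma)                    # N = 30: noisy estimate
  x_large <- mvrnorm(1e5, mu, Sigma)                   # large N: close to r
  x_exact <- mvrnorm(30, mu, Sigma, empirical = TRUE)  # sample moments forced to match
  cat(sprintf("target %.1f | N=30: %.3f | N=1e5: %.3f | empirical: %.3f\n",
              r, cor(x_small)[1, 2], cor(x_large)[1, 2], cor(x_exact)[1, 2]))
}
```

If every simulated file needs to hit its correlation condition exactly, empirical = TRUE may be the simpler option. If the files are meant to behave like random samples from a population, the default is more appropriate, and some scatter at N = 30 is expected.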
