foreach with serial and parallel backends gives different results

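I have a strange situation: the first time I call my function, foreach gives different results with the serial and the parallel backend, but on later calls the two results match. I am using doRNG so that results are reproducible for the same seed.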

Here is an example function that illustrates the scenario:

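library(doParallel)  # also attaches foreach and parallel
library(doRNG)       # provides %dorng% and registerDoRNG()

func <- function(ncores = NULL, seed = 1234){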
  if (!is.null(ncores)){ # this block registers for parallel backend
    cl <- makeCluster(ncores)
    registerDoParallel(cl)
    registerDoRNG(seed, once = TRUE)
    on.exit(stopCluster(cl)) 
  } else {              # this block registers for serial computation
    registerDoSEQ() 
    registerDoRNG(seed, once = TRUE)
  }
  w = foreach(i = 1:10, .combine = 'c') %dorng% {
    mean(sample(1:100, 50, replace = TRUE))
  }
  attr(w, "rng") <- NULL
  return(w)
}

# first time running below 2 lines
# case 1 : serial
w1 <- func(ncores = NULL)
# Case 2 : parallel
w2 <- func(ncores= 5)
identical(w1, w2)

# second time running below 2 lines
# case 1: serial
w3 <- func(ncores = NULL)
# case 2: parallel 
w4 <- func(ncores= 5)

identical(w1, w2)
# [1] FALSE
identical(w3, w4)
# [1] TRUE

Am I missing something when registering the sequential backend?

A workaround is to drop the registerDoRNG() calls and pass the seed directly to foreach through .options.RNG:

w = foreach(i = 1:10, .combine = 'c', .options.RNG=seed) %dorng% {
    mean(sample(1:100, 50, replace = TRUE))}

You can find the explanation in the doRNG vignette.

So your function would look like this:

func <- function(ncores = NULL, seed = 1234){
  if (!is.null(ncores)){ # this block registers for parallel backend
    cl <- makeCluster(ncores)
    registerDoParallel(cl)
    on.exit(stopCluster(cl)) 
  } else {              # this block registers for serial computation
    registerDoSEQ() 
  }
  w = foreach(i = 1:10, .combine = 'c', .options.RNG=seed) %dorng% {
    mean(sample(1:100, 50, replace = TRUE))
  }
  attr(w, "rng") <- NULL
  return(w)
}
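
To check the fix in a fresh R session, the comparison from the question can be repeated against the revised function. The sketch below is only an illustration: the name `run_once` and the worker count of 2 are assumptions made here, not part of the original code, and it assumes the doParallel and doRNG packages are installed.

```r
library(doParallel)  # also attaches foreach and parallel
library(doRNG)       # provides %dorng%

# Hypothetical name; same body as the revised func above.
run_once <- function(ncores = NULL, seed = 1234) {
  if (!is.null(ncores)) {
    cl <- makeCluster(ncores)
    on.exit(stopCluster(cl))   # shut the workers down when the function exits
    registerDoParallel(cl)
  } else {
    registerDoSEQ()
  }
  # The seed is passed per loop, so no registerDoRNG() call is involved.
  w <- foreach(i = 1:10, .combine = "c", .options.RNG = seed) %dorng% {
    mean(sample(1:100, 50, replace = TRUE))
  }
  attr(w, "rng") <- NULL
  w
}

w_serial   <- run_once(ncores = NULL)  # first call, serial backend
w_parallel <- run_once(ncores = 2)     # first call, parallel backend
print(identical(w_serial, w_parallel))
```

If the workaround behaves as described, the comparison should already agree on the first pair of calls rather than only from the second pair onward.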