foreach with Rcpp in R package error: <simpleError in .Call("<function_name>"..."<function name>" not available for .Call() for package "<package>">

I am trying to parallelize Rcpp code. Following this post, I was able to get my MRE to run and produce the expected output by sourcing the functions:

> Rcpp::sourceCpp("src/rnorm_c.cpp")
> source("~/<path to project folder>/rnormpar/R/normal_mat.R")
> norm_mat_par()
[[1]]
           [,1]
[1,] -0.1117342

[[2]]
           [,1]
[1,] 0.05094005

[[3]]
          [,1]
[1,] 0.1137641

[[4]]
          [,1]
[1,] 0.8624004

[[5]]
          [,1]
[1,] 0.7821107

However, after building the package and running the function from it, the output becomes:

Restarting R session...

> library(rnormpar)
> rnormpar::norm_mat_par()
[[1]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[2]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[3]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[4]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[5]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

Here is the code for my MRE. It consists of two scripts. The first is the Rcpp code:

#include <RcppArmadillo.h>
//[[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;

// function to generate a single sample from the standard normal distribution
//[[Rcpp::export]]
double rnorm1() {
  return (double)arma::vec(1, arma::fill::randn)(0, 0);
}

// function to return a vector of n samples from the normal distribution
//[[Rcpp::export]]
arma::vec rnorm_n(int n = 1, double mu = 0, double sd = 1){

  arma::vec res(n);

  for (int j = 0; j < n; j++){
    res(j) = rnorm1();
  }

  res = res * sd + mu;

  return res;
}

The second is the R code:

# generates a matrix distributed independent normal
# takes n, p, mean vector, and sd vector representing the diagonal of the
# covariance matrix
#' Normal matrix
#'
#' @param n sample size
#' @param p number of variables
#' @param mu mean vector
#' @param sd diagonal of the covariance matrix
#'
#' @return normal matrix
#' @export
#'
#' @examples norm_mat(1e2, 3, -1:1, 1:3)
norm_mat <- function(n = 1, p = 1, mu = rep(0, p), sd = rep(1, p)){

  res <- matrix(NA, n, p)

  for(j in 1:p){
    res[ , j] <- rnorm_n(n, mu[j], sd[j])
  }

  return(res)

}

#' Title
#'
#' @return
#' @export
#'
#' @examples
norm_mat_par <- function(){

  nworkers <- parallel::detectCores() - 1

  cl <- parallel::makeCluster(nworkers)

  doParallel::registerDoParallel(cl)

  x <- foreach::`%dopar%`(
    foreach::foreach(j = 1:5, .errorhandling='pass', .export = "norm_mat",
                     .noexport = c("rnorm_n", "rnorm1"), .packages = c("Rcpp")),
    {
      sourceCpp("src/rnorm_c.cpp")
      norm_mat()
    })

  parallel::stopCluster(cl)

  return(x)
}

This is the GitHub repo for my MRE.

Thanks in advance to everyone for taking the time to reply!

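The GitHub repository rcpp-and-doparallel provided the solution.

I will demonstrate here how I modified my package. The corresponding commit in the rnormpar repository has the commit message "solved parallelization".

First, I modified the R script named rnorm_package.R, which was created to register my cpp functions, so that it mirrors the corresponding script in the rcpp-and-doparallel package: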

#' @keywords internal
"_PACKAGE"

# The following block is used by usethis to automatically manage
# roxygen namespace tags. Modify with care!
## usethis namespace: start
#' @useDynLib rnormpar, .registration = TRUE
#' @importFrom Rcpp sourceCpp
## usethis namespace: end
NULL

Then I deleted my NAMESPACE and re-generated it using devtools::document(). This resulted in the following lines being added to NAMESPACE:

importFrom(Rcpp,sourceCpp)
useDynLib(rnormpar, .registration = TRUE)

If these lines are already in NAMESPACE, then the first two steps may be unnecessary.
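
One way to decide whether those first two steps can be skipped is to check the NAMESPACE text for the useDynLib directive. The following is a minimal sketch, not part of the original fix: has_use_dynlib is a hypothetical helper name, and example_namespace is an illustrative stand-in for the real file contents.

```r
# Hypothetical helper (not from rnormpar): TRUE if any line of the NAMESPACE
# text is a useDynLib() directive for the given package.
# Note: the package name is pasted into a regular expression as-is, so a
# dot in the name would match any character; good enough for a quick check.
has_use_dynlib <- function(namespace_lines, pkg) {
  pattern <- paste0("^[[:space:]]*useDynLib\\([[:space:]]*", pkg, "[,)[:space:]]")
  any(grepl(pattern, namespace_lines))
}

# Illustrative NAMESPACE contents; the export() lines are assumed here.
example_namespace <- c(
  "export(norm_mat)",
  "export(norm_mat_par)",
  "importFrom(Rcpp,sourceCpp)",
  "useDynLib(rnormpar, .registration = TRUE)"
)

has_use_dynlib(example_namespace, "rnormpar")
```

In a real package directory the same check could presumably be run as has_use_dynlib(readLines("NAMESPACE"), "rnormpar").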

Finally, I modified the arguments of the foreach call so that my package is passed to the workers:

norm_mat_par <- function(){

  nworkers <- parallel::detectCores() - 1

  cl <- parallel::makeCluster(nworkers)

  doParallel::registerDoParallel(cl)

  x <- foreach::`%dopar%`(
    foreach::foreach(j = 1:5, .packages = "rnormpar"),
    {
      norm_mat()
    })

  parallel::stopCluster(cl)

  return(x)
}
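
To see what the .packages argument does in isolation, here is a small sketch that assumes foreach and doParallel are installed. It uses the base package tools as a stand-in for rnormpar, and ext_par is a hypothetical function name. Each worker attaches the named package before evaluating the loop body, so functions from that package can be called unqualified. The on.exit() call is an optional addition of mine so that the cluster is stopped even if the loop fails.

```r
library(foreach)  # provides foreach() and %dopar%

# Hypothetical example function: extracts file extensions in parallel.
# "tools" stands in for rnormpar; file_ext() is only found on the workers
# because .packages = "tools" attaches that package there.
ext_par <- function(files, nworkers = 2) {
  cl <- parallel::makeCluster(nworkers)
  # Stop the cluster when the function exits, including on error.
  on.exit(parallel::stopCluster(cl), add = TRUE)
  doParallel::registerDoParallel(cl)

  foreach(f = files, .packages = "tools") %dopar% {
    file_ext(f)
  }
}

ext_par(c("rnorm_c.cpp", "normal_mat.R"))
```

The same pattern applies to norm_mat_par above: once the package is installed and its compiled routines are registered, naming it in .packages is enough for the workers to find the compiled code, so sourceCpp() is no longer needed inside the loop.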

After building the package, the function produces the expected output:

Restarting R session...

> library(rnormpar)
> rnormpar::norm_mat_par()
[[1]]
          [,1]
[1,] -1.948502

[[2]]
           [,1]
[1,] -0.2774582

[[3]]
          [,1]
[1,] 0.1710537

[[4]]
         [,1]
[1,] 1.784761

[[5]]
           [,1]
[1,] -0.5694733