Rcppparallel bootstrap
Rcppparallel bootstrap
我假设,或者更确切地说,我希望,我有一个单一的可解决的问题,或者可能是许多较小的问题,应该放弃。无论哪种方式,我都是 Rcpp 的新手,对并行计算非常不了解,无法在线找到解决方案。
问题通常是,R 或 R 中的 'fatal error' 陷入循环,大约 5 分钟进行 10 次迭代,而非并行版本将同时进行 5K 次迭代,大致正在发言。
由于该算法适用于一个更大的项目,我调用了其他几个函数,这些都在 Rcpp 中,我只用 'arma' 个对象重写了它们,因为这似乎可以帮助其他人,在这里。我还 运行 使用我在 Rcpp 中编写的 'heat map' 优化器的优化部分,再次完全在 'arma' 中没有改进 - 我还应该指出这作为 'arma::vec' 返回。
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::depends("RcppParallel")]]
#include <RcppArmadillo.h>
#include <RcppParallel.h>
using namespace Rcpp;
using namespace std;
using namespace arma;
using namespace RcppParallel;
struct Boot_Worker : public Worker {
//Generate Inputs
// Source vector to keep track of the number of bootstraps
const arma::vec Boot_reps;
// Initial non-linear theta parameter values
const arma::vec init_val;
// Decimal date vector
const arma::colvec T_series;
// Generate the price series observational vector
const arma::colvec Y_est;
const arma::colvec Y_res;
// Generate the optimization constants
const arma::mat U;
const arma::colvec C;
const int N;
// Generate Output Matrix
arma::mat Boots_out;
// Initialize with the proper input and output
Boot_Worker( const arma::vec Boot_reps, const arma::vec init_val, const arma::colvec T_series, const arma::colvec Y_est, const arma::colvec Y_res, const arma::mat U, const arma::colvec C, const int N, arma::mat Boots_out)
: Boot_reps(Boot_reps), init_val(init_val), T_series(T_series), Y_est(Y_est), Y_res(Y_res), U(U), C(C), N(N), Boots_out(Boots_out) {}
void operator()(std::size_t begin, std::size_t end){
//load necessary stuffs from around
Rcpp::Environment stats("package:stats");
Rcpp::Function constrOptim = stats["constrOptim"];
Rcpp::Function SDK_pred_mad( "SDK_pred_mad");
arma::mat fake_data(N,2);
arma::colvec index(N);
for(unsigned int i = begin; i < end; i ++){
// Need a nested loop to create and fill the fake data matrix
arma::vec pool = arma::regspace(0, N-1) ;
std::random_shuffle(pool.begin(), pool.end());
for(int k = 0; k <= N-1; k++){
fake_data(k, 0) = Y_est[k] + Y_res[ pool[k] ];
fake_data(k, 1) = T_series[k];
}
// Call the optimization
Rcpp::List opt_results = constrOptim(Rcpp::_["theta"] = init_val,
Rcpp::_["f"] = SDK_pred_mad,
Rcpp::_["data_in"] = fake_data,
Rcpp::_["grad"] = "NULL",
Rcpp::_["method"] = "Nelder-Mead",
Rcpp::_["ui"] = U,
Rcpp::_["ci"] = C );
/// fill the output matrix ///
// need to create an place holder arma vector for the parameter output
arma::vec opt_param = Rcpp::as<arma::vec>(opt_results[0]);
Boots_out(i, 0) = opt_param[0];
Boots_out(i, 1) = opt_param[1];
Boots_out(i, 2) = opt_param[2];
// for the cost function value at optimization
arma::vec opt_value = Rcpp::as<arma::vec>(opt_results[1]);
Boots_out(i, 3) = opt_value[0];
// for the number of function calls (?)
arma::vec counts = Rcpp::as<arma::vec>(opt_results[2]);
Boots_out(i, 4) = counts[0];
// for thhe convergence code
arma::vec convergence = Rcpp::as<arma::vec>(opt_results[3]);
Boots_out(i, 5) = convergence[0];
}
}
};
// [[Rcpp::export]]
arma::mat SDK_boots_test(arma::vec init_val, arma::mat data_in, int boots_n){
//First establish theta_sp, estimate and residuals
const int N = arma::size(data_in)[0];
// Create the constraints for the constrained optimization
// Make a boundry boundry condition matrix of the form Ui*theta - ci >= 0
arma::mat U(6, 3);
U(0, 0) = 1;
U(1, 0) = -1;
U(2, 0) = 0;
U(3, 0) = 0;
U(4, 0) = 0;
U(5, 0) = 0;
U(0, 1) = 0;
U(1, 1) = 0;
U(2, 1) = 1;
U(3, 1) = -1;
U(4, 1) = 0;
U(5, 1) = 0;
U(0, 2) = 0;
U(1, 2) = 0;
U(2, 2) = 0;
U(3, 2) = 0;
U(4, 2) = 1;
U(5, 2) = -1;
arma::colvec C(6);
C[0] = 0;
C[1] = -data_in(N-1, 9)-0.5;
C[2] = 0;
C[3] = -3;
C[4] = 0;
C[5] = -50;
Rcpp::Function SDK_est( "SDK_est");
Rcpp::Function SDK_res( "SDK_res");
arma::vec Y_est = as<arma::vec>(SDK_est(init_val, data_in));
arma::vec Y_res = as<arma::vec>(SDK_res(init_val, data_in));
// Generate feed items for the Bootstrap Worker
arma::vec T_series = data_in( span(0, N-1), 9);
arma::vec Boots_reps(boots_n+1);
// Allocate the output matrix
arma::mat Boots_out(boots_n, 6);
// Pass input and output the Bootstrap Worker
Boot_Worker Boot_Worker(Boots_reps, init_val, T_series, Y_est, Y_res, U, C, N, Boots_out);
// Now finnaly call the parallel for loop
parallelFor(0, Boots_reps.size(), Boot_Worker);
return Boots_out;
}
所以我在我的 'heat algorithm' 中回复了解决优化的问题,这完全在 Rcpp-armadillo 中,这大大简化了代码,因为约束被写入优化器。此外,我删除了运行domization,所以它只需要解决相同的优化;只是为了看看这是不是唯一的问题。毫无疑问,我仍然拥有相同的 'fatal error'.
这里的代码是:
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::depends("RcppParallel")]]
#include <RcppArmadillo.h>
#include <RcppParallel.h>
#include <random>
using namespace Rcpp;
using namespace std;
using namespace arma;
using namespace RcppParallel;
struct Boot_Worker : public Worker {
//Generate Inputs
// Source vector to keep track of the number of bootstraps
const arma::vec Boot_reps;
// Initial non-linear theta parameter values
const arma::vec init_val;
// Decimal date vector
const arma::colvec T_series;
// Generate the price series observational vector
const arma::colvec Y_est;
const arma::colvec Y_res;
const int N;
// Generate Output Matrix
arma::mat Boots_out;
// Initialize with the proper input and output
Boot_Worker( const arma::vec Boot_reps, const arma::vec init_val, const arma::colvec T_series, const arma::colvec Y_est, const arma::colvec Y_res, const int N, arma::mat Boots_out)
: Boot_reps(Boot_reps), init_val(init_val), T_series(T_series), Y_est(Y_est), Y_res(Y_res), N(N), Boots_out(Boots_out) {}
void operator()(std::size_t begin, std::size_t end){
//load necessary stuffs from around
Rcpp::Function SDK_heat( "SDK_heat");
arma::mat fake_data(N,2);
arma::colvec index(N);
for(unsigned int i = begin; i < end; i ++){
// Need a nested loop to create and fill the fake data matrix
//arma::vec pool = arma::shuffle( arma::regspace(0, N-1) );
for(int k = 0; k <= N-1; k++){
fake_data(k, 0) = Y_est[k] + Y_res[ k ];
//fake_data(k, 0) = Y_est[k] + Y_res[ pool[k] ];
fake_data(k, 1) = T_series[k];
}
// Call the optimization
arma::vec opt_results = Rcpp::as<arma::vec>( SDK_heat(Rcpp::_["data_in"] = fake_data, Rcpp::_["tol"] = 0.1) );
/// fill the output matrix ///
// need to create an place holder arma vector for the parameter output
Boots_out(i, 0) = opt_results[0];
Boots_out(i, 1) = opt_results[1];
Boots_out(i, 2) = opt_results[2];
// for the cost function value at optimization
Boots_out(i, 3) = opt_results[3];
}
}
};
// [[Rcpp::export]]
arma::mat SDK_boots_test(arma::vec init_val, arma::mat data_in, int boots_n){
//First establish theta_sp, estimate and residuals
const int N = arma::size(data_in)[0];
Rcpp::Function SDK_est( "SDK_est");
Rcpp::Function SDK_res( "SDK_res");
const arma::vec Y_est = as<arma::vec>(SDK_est(init_val, data_in));
const arma::vec Y_res = as<arma::vec>(SDK_res(init_val, data_in));
// Generate feed items for the Bootstrap Worker
const arma::vec T_series = data_in( span(0, N-1), 9);
arma::vec Boots_reps(boots_n+1);
// Allocate the output matrix
arma::mat Boots_out(boots_n, 4);
// Pass input and output the Bootstrap Worker
Boot_Worker Boot_Worker(Boots_reps, init_val, T_series, Y_est, Y_res, N, Boots_out);
// Now finnaly call the parallel for loop
parallelFor(0, Boots_reps.size(), Boot_Worker);
return Boots_out;
}
查看您的代码,我看到以下内容:
struct Boot_Worker : public Worker {
[...]
void operator()(std::size_t begin, std::size_t end){
//load necessary stuffs from around
Rcpp::Environment stats("package:stats");
Rcpp::Function constrOptim = stats["constrOptim"];
Rcpp::Function SDK_pred_mad( "SDK_pred_mad");
[...]
// Call the optimization
Rcpp::List opt_results = constrOptim(Rcpp::_["theta"] = init_val,
Rcpp::_["f"] = SDK_pred_mad,
Rcpp::_["data_in"] = fake_data,
Rcpp::_["grad"] = "NULL",
Rcpp::_["method"] = "Nelder-Mead",
Rcpp::_["ui"] = U,
Rcpp::_["ci"] = C );
您正在从多线程 C++ 上下文中调用 R 函数。那是你 should not do 的东西。 R 是单线程的,因此这将导致未定义的行为或崩溃:
API Restrictions
The code that you write within parallel workers should not call the R or Rcpp API in any fashion. This is because R is single-threaded and concurrent interaction with it’s data structures can cause crashes and other undefined behavior. Here is the official guidance from Writing R Extensions:
Calling any of the R API from threaded code is ‘for experts only’: they will need to read the source code to determine if it is thread-safe. In particular, code which makes use of the stack-checking mechanism must not be called from threaded code.
此外,即使在单线程上下文中从 C++ 回调 R 也不是您可以为性能做的最好的事情。使用提供直接 C(++) 接口的优化库应该更有效。一种可能是 nlopt 的开发版本,c.f。 this issue for a discussion and references to examples. In addition, std::random_shuffle
is not only deprecated in C++14 and removed from C++17, but it is also not thread-safe.
在你的第二个例子中,你说函数 SDK_heat
实际上是用 C++ 实现的。在那种情况下你可以直接调用它:
- 去掉导入相应的R函数,即
Rcpp::Function SDK_heat( "SDK_heat");
- 确保编译器知道 C++ 函数的声明并且链接器具有实际函数:
- 快速而肮脏:将函数定义复制到
cpp
文件中,然后再定义 BootWorker
。
- 要获得更清晰的方法,请参阅 Rcpp attributes vignette
中的“1.10 共享代码”部分
- 像调用任何其他 C++ 函数一样调用该函数,即使用类型与函数声明兼容的位置参数。
所有这些都假定您正在使用 sourceCpp
,正如您对 [[Rcpp::depends(...)]]
的使用所表明的那样。您正在达到一种复杂性,可以保证从中构建一个包。
我假设,或者更确切地说,我希望,我有一个单一的可解决的问题,或者可能是许多较小的问题,应该放弃。无论哪种方式,我都是 Rcpp 的新手,对并行计算非常不了解,无法在线找到解决方案。
问题通常是,R 或 R 中的 'fatal error' 陷入循环,大约 5 分钟进行 10 次迭代,而非并行版本将同时进行 5K 次迭代,大致正在发言。
由于该算法适用于一个更大的项目,我调用了其他几个函数,这些都在 Rcpp 中,我只用 'arma' 个对象重写了它们,因为这似乎可以帮助其他人,在这里。我还 运行 使用我在 Rcpp 中编写的 'heat map' 优化器的优化部分,再次完全在 'arma' 中没有改进 - 我还应该指出这作为 'arma::vec' 返回。
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::depends("RcppParallel")]]
#include <RcppArmadillo.h>
#include <RcppParallel.h>
using namespace Rcpp;
using namespace std;
using namespace arma;
using namespace RcppParallel;
struct Boot_Worker : public Worker {
//Generate Inputs
// Source vector to keep track of the number of bootstraps
const arma::vec Boot_reps;
// Initial non-linear theta parameter values
const arma::vec init_val;
// Decimal date vector
const arma::colvec T_series;
// Generate the price series observational vector
const arma::colvec Y_est;
const arma::colvec Y_res;
// Generate the optimization constants
const arma::mat U;
const arma::colvec C;
const int N;
// Generate Output Matrix
arma::mat Boots_out;
// Initialize with the proper input and output
Boot_Worker( const arma::vec Boot_reps, const arma::vec init_val, const arma::colvec T_series, const arma::colvec Y_est, const arma::colvec Y_res, const arma::mat U, const arma::colvec C, const int N, arma::mat Boots_out)
: Boot_reps(Boot_reps), init_val(init_val), T_series(T_series), Y_est(Y_est), Y_res(Y_res), U(U), C(C), N(N), Boots_out(Boots_out) {}
void operator()(std::size_t begin, std::size_t end){
//load necessary stuffs from around
Rcpp::Environment stats("package:stats");
Rcpp::Function constrOptim = stats["constrOptim"];
Rcpp::Function SDK_pred_mad( "SDK_pred_mad");
arma::mat fake_data(N,2);
arma::colvec index(N);
for(unsigned int i = begin; i < end; i ++){
// Need a nested loop to create and fill the fake data matrix
arma::vec pool = arma::regspace(0, N-1) ;
std::random_shuffle(pool.begin(), pool.end());
for(int k = 0; k <= N-1; k++){
fake_data(k, 0) = Y_est[k] + Y_res[ pool[k] ];
fake_data(k, 1) = T_series[k];
}
// Call the optimization
Rcpp::List opt_results = constrOptim(Rcpp::_["theta"] = init_val,
Rcpp::_["f"] = SDK_pred_mad,
Rcpp::_["data_in"] = fake_data,
Rcpp::_["grad"] = "NULL",
Rcpp::_["method"] = "Nelder-Mead",
Rcpp::_["ui"] = U,
Rcpp::_["ci"] = C );
/// fill the output matrix ///
// need to create an place holder arma vector for the parameter output
arma::vec opt_param = Rcpp::as<arma::vec>(opt_results[0]);
Boots_out(i, 0) = opt_param[0];
Boots_out(i, 1) = opt_param[1];
Boots_out(i, 2) = opt_param[2];
// for the cost function value at optimization
arma::vec opt_value = Rcpp::as<arma::vec>(opt_results[1]);
Boots_out(i, 3) = opt_value[0];
// for the number of function calls (?)
arma::vec counts = Rcpp::as<arma::vec>(opt_results[2]);
Boots_out(i, 4) = counts[0];
// for thhe convergence code
arma::vec convergence = Rcpp::as<arma::vec>(opt_results[3]);
Boots_out(i, 5) = convergence[0];
}
}
};
// [[Rcpp::export]]
arma::mat SDK_boots_test(arma::vec init_val, arma::mat data_in, int boots_n){
//First establish theta_sp, estimate and residuals
const int N = arma::size(data_in)[0];
// Create the constraints for the constrained optimization
// Make a boundry boundry condition matrix of the form Ui*theta - ci >= 0
arma::mat U(6, 3);
U(0, 0) = 1;
U(1, 0) = -1;
U(2, 0) = 0;
U(3, 0) = 0;
U(4, 0) = 0;
U(5, 0) = 0;
U(0, 1) = 0;
U(1, 1) = 0;
U(2, 1) = 1;
U(3, 1) = -1;
U(4, 1) = 0;
U(5, 1) = 0;
U(0, 2) = 0;
U(1, 2) = 0;
U(2, 2) = 0;
U(3, 2) = 0;
U(4, 2) = 1;
U(5, 2) = -1;
arma::colvec C(6);
C[0] = 0;
C[1] = -data_in(N-1, 9)-0.5;
C[2] = 0;
C[3] = -3;
C[4] = 0;
C[5] = -50;
Rcpp::Function SDK_est( "SDK_est");
Rcpp::Function SDK_res( "SDK_res");
arma::vec Y_est = as<arma::vec>(SDK_est(init_val, data_in));
arma::vec Y_res = as<arma::vec>(SDK_res(init_val, data_in));
// Generate feed items for the Bootstrap Worker
arma::vec T_series = data_in( span(0, N-1), 9);
arma::vec Boots_reps(boots_n+1);
// Allocate the output matrix
arma::mat Boots_out(boots_n, 6);
// Pass input and output the Bootstrap Worker
Boot_Worker Boot_Worker(Boots_reps, init_val, T_series, Y_est, Y_res, U, C, N, Boots_out);
// Now finnaly call the parallel for loop
parallelFor(0, Boots_reps.size(), Boot_Worker);
return Boots_out;
}
所以我在我的 'heat algorithm' 中回复了解决优化的问题,这完全在 Rcpp-armadillo 中,这大大简化了代码,因为约束被写入优化器。此外,我删除了运行domization,所以它只需要解决相同的优化;只是为了看看这是不是唯一的问题。毫无疑问,我仍然拥有相同的 'fatal error'.
这里的代码是:
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::depends("RcppParallel")]]
#include <RcppArmadillo.h>
#include <RcppParallel.h>
#include <random>
using namespace Rcpp;
using namespace std;
using namespace arma;
using namespace RcppParallel;
struct Boot_Worker : public Worker {
//Generate Inputs
// Source vector to keep track of the number of bootstraps
const arma::vec Boot_reps;
// Initial non-linear theta parameter values
const arma::vec init_val;
// Decimal date vector
const arma::colvec T_series;
// Generate the price series observational vector
const arma::colvec Y_est;
const arma::colvec Y_res;
const int N;
// Generate Output Matrix
arma::mat Boots_out;
// Initialize with the proper input and output
Boot_Worker( const arma::vec Boot_reps, const arma::vec init_val, const arma::colvec T_series, const arma::colvec Y_est, const arma::colvec Y_res, const int N, arma::mat Boots_out)
: Boot_reps(Boot_reps), init_val(init_val), T_series(T_series), Y_est(Y_est), Y_res(Y_res), N(N), Boots_out(Boots_out) {}
void operator()(std::size_t begin, std::size_t end){
//load necessary stuffs from around
Rcpp::Function SDK_heat( "SDK_heat");
arma::mat fake_data(N,2);
arma::colvec index(N);
for(unsigned int i = begin; i < end; i ++){
// Need a nested loop to create and fill the fake data matrix
//arma::vec pool = arma::shuffle( arma::regspace(0, N-1) );
for(int k = 0; k <= N-1; k++){
fake_data(k, 0) = Y_est[k] + Y_res[ k ];
//fake_data(k, 0) = Y_est[k] + Y_res[ pool[k] ];
fake_data(k, 1) = T_series[k];
}
// Call the optimization
arma::vec opt_results = Rcpp::as<arma::vec>( SDK_heat(Rcpp::_["data_in"] = fake_data, Rcpp::_["tol"] = 0.1) );
/// fill the output matrix ///
// need to create an place holder arma vector for the parameter output
Boots_out(i, 0) = opt_results[0];
Boots_out(i, 1) = opt_results[1];
Boots_out(i, 2) = opt_results[2];
// for the cost function value at optimization
Boots_out(i, 3) = opt_results[3];
}
}
};
// [[Rcpp::export]]
arma::mat SDK_boots_test(arma::vec init_val, arma::mat data_in, int boots_n){
//First establish theta_sp, estimate and residuals
const int N = arma::size(data_in)[0];
Rcpp::Function SDK_est( "SDK_est");
Rcpp::Function SDK_res( "SDK_res");
const arma::vec Y_est = as<arma::vec>(SDK_est(init_val, data_in));
const arma::vec Y_res = as<arma::vec>(SDK_res(init_val, data_in));
// Generate feed items for the Bootstrap Worker
const arma::vec T_series = data_in( span(0, N-1), 9);
arma::vec Boots_reps(boots_n+1);
// Allocate the output matrix
arma::mat Boots_out(boots_n, 4);
// Pass input and output the Bootstrap Worker
Boot_Worker Boot_Worker(Boots_reps, init_val, T_series, Y_est, Y_res, N, Boots_out);
// Now finnaly call the parallel for loop
parallelFor(0, Boots_reps.size(), Boot_Worker);
return Boots_out;
}
查看您的代码,我看到以下内容:
struct Boot_Worker : public Worker {
[...]
void operator()(std::size_t begin, std::size_t end){
//load necessary stuffs from around
Rcpp::Environment stats("package:stats");
Rcpp::Function constrOptim = stats["constrOptim"];
Rcpp::Function SDK_pred_mad( "SDK_pred_mad");
[...]
// Call the optimization
Rcpp::List opt_results = constrOptim(Rcpp::_["theta"] = init_val,
Rcpp::_["f"] = SDK_pred_mad,
Rcpp::_["data_in"] = fake_data,
Rcpp::_["grad"] = "NULL",
Rcpp::_["method"] = "Nelder-Mead",
Rcpp::_["ui"] = U,
Rcpp::_["ci"] = C );
您正在从多线程 C++ 上下文中调用 R 函数。那是你 should not do 的东西。 R 是单线程的,因此这将导致未定义的行为或崩溃:
API Restrictions
The code that you write within parallel workers should not call the R or Rcpp API in any fashion. This is because R is single-threaded and concurrent interaction with it’s data structures can cause crashes and other undefined behavior. Here is the official guidance from Writing R Extensions:
Calling any of the R API from threaded code is ‘for experts only’: they will need to read the source code to determine if it is thread-safe. In particular, code which makes use of the stack-checking mechanism must not be called from threaded code.
此外,即使在单线程上下文中从 C++ 回调 R 也不是您可以为性能做的最好的事情。使用提供直接 C(++) 接口的优化库应该更有效。一种可能是 nlopt 的开发版本,c.f。 this issue for a discussion and references to examples. In addition, std::random_shuffle
is not only deprecated in C++14 and removed from C++17, but it is also not thread-safe.
在你的第二个例子中,你说函数 SDK_heat
实际上是用 C++ 实现的。在那种情况下你可以直接调用它:
- 去掉导入相应的R函数,即
Rcpp::Function SDK_heat( "SDK_heat");
- 确保编译器知道 C++ 函数的声明并且链接器具有实际函数:
- 快速而肮脏:将函数定义复制到
cpp
文件中,然后再定义BootWorker
。 - 要获得更清晰的方法,请参阅 Rcpp attributes vignette 中的“1.10 共享代码”部分
- 快速而肮脏:将函数定义复制到
- 像调用任何其他 C++ 函数一样调用该函数,即使用类型与函数声明兼容的位置参数。
所有这些都假定您正在使用 sourceCpp
,正如您对 [[Rcpp::depends(...)]]
的使用所表明的那样。您正在达到一种复杂性,可以保证从中构建一个包。