来自大稀疏矩阵的 Rcpp submat

Question

我正在尝试将一个 vec 乘以一个非常大的稀疏矩阵的子集（如下脚本所示），但是在使用 sourceCpp 时无法编译，它报告 error: no matching function for call to ‘arma::SpMat<double>::submat(arma::uvec&, arma::uvec&)，它如果有人能帮我一个忙，我将不胜感激。

#include <RcppArmadillo.h>

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
double myf(sp_mat X, vec g, uvec xi){
    double u = g(xi).t() * X.submat(xi, xi) * g(xi);
    return u;
}

Answer 1

所以，正如@RalfStubner 提到的，matrix access for sparse matrices is continuous only. That said, the access approach taken is symmetric for the actual sparse matrix since the same index is being used. So, in this case, it makes sense to revert back to a standard element accessor of (x,y)。结果，总和减少可以用一个循环完成。

#include <RcppArmadillo.h>

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
double submat_multiply(const arma::sp_mat& X, 
                       const arma::vec& g, const arma::uvec& xi){

  // Add an assertion
  if(X.n_rows != g.n_elem) {
    Rcpp::stop("Mismatched row and column dimensions of X (%s) and g (%s).",
               X.n_rows, g.n_elem);
  }

  // Reduction
  double summed = 0; 

  for (unsigned int i = 0; i < xi.n_elem; ++i) {
    // Retrieve indexing element
    arma::uword index_at_i = xi(i);
    // Add components together
    summed += g(index_at_i) * X(index_at_i, index_at_i) * g(index_at_i);
  }

  // Return result
  return summed;
}

另一种可能成本更高的方法是提取稀疏矩阵的对角线并将其转换为密集向量。从那里应用逐元素乘法和求和。

#include <RcppArmadillo.h>

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(RcppArmadillo)]]

// [[Rcpp::export]]
double submat_multiply_v2(const arma::sp_mat& X, 
                          const arma::vec& g, const arma::uvec& xi){

  // Add an assertion
  if(X.n_rows != g.n_elem) {
    Rcpp::stop("Mismatched row and column dimensions of X (%s) and g (%s).",
               X.n_rows, g.n_elem);
  }

  // Copy sparse diagonal to dense vector
  arma::vec x_diag(X.diag());
  // Obtain the subset
  arma::vec g_sub = g.elem(xi); 

  // Perform element-wise multiplication and then sum.
  double summed = arma::sum(g_sub % x_diag.elem(xi) % g_sub);

  // Return result
  return summed;
}

测试代码：

# Sparse matrix
library(Matrix)
i <- c(1,4:8,10); j <- c(2, 9, 6:10); x <- 7 * (1:7)
X <- sparseMatrix(i, j, x = x)  

X
# 10 x 10 sparse Matrix of class "dgCMatrix"
#                               
#  [1,] . 7 . . .  .  .  .  .  .
#  [2,] . . . . .  .  .  .  .  .
#  [3,] . . . . .  .  .  .  .  .
#  [4,] . . . . .  .  .  . 14  .
#  [5,] . . . . . 21  .  .  .  .
#  [6,] . . . . .  . 28  .  .  .
#  [7,] . . . . .  .  . 35  .  .
#  [8,] . . . . .  .  .  . 42  .
#  [9,] . . . . .  .  .  .  .  .
# [10,] . . . . .  .  .  .  . 49

# Vector
g <- 1:10

# Indices
xi <- c(0, 3, 4, 9)

# Above function
submat_multiply(X, g, xi)
# [1] 4900

submat_multiply_v2(X, g, xi)
# [1] 4900

来自大稀疏矩阵的 Rcpp submat

Rcpp submat from a big sparse matrix

r

sparse-matrix

armadillo

rcpp

rcpparmadillo