如何使用 Rcpp 来加速 for 循环?

How to use Rcpp to speed up a for loop?

我创建了一个 for 循环,我想使用 Rcpp 库加快它的速度。我不太熟悉 C++。你能帮我加快我的功能吗? 感谢您的帮助!

我已经将我的算法、代码、输入和输出以及会话信息包括在内。

这是我的算法:

如果当前价格高于前一个价格,则在名为 TR 的列中标记 (+1)

如果当前价格低于前一个价格,则在名为 TR 的列中标记 (-1)

如果当前价格与之前的价格相同, 在名为 TR

的列中标记与先前价格相同的内容

这是我的代码:

price <- c(71.91, 71.82, 71.81, 71.81, 71.81, 71.82, 71.81, 71.81, 71.81, 
           71.82, 71.81, 71.81, 71.8, 71.81, 71.8, 71.81, 71.8, 71.8, 71.8, 
           71.8, 71.81, 71.81, 71.81, 71.81, 71.81, 71.81, 71.81, 71.81, 
           71.81, 71.82, 71.81, 71.81, 71.81, 71.81, 71.81, 71.81, 71.8, 
           71.8, 71.81, 71.81, 71.81, 71.81, 71.82, 71.82, 71.81, 71.81, 
           71.81, 71.81, 71.81, 71.81, 71.81, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.81, 71.82, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.82, 71.81, 71.81, 71.81, 71.82, 71.82, 
           71.81, 71.82, 71.82, 71.82, 71.81, 71.82, 71.82, 71.82, 71.81, 
           71.81, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 71.81, 
           71.82, 71.82, 71.82, 71.82, 71.83, 71.82, 71.82, 71.82, 71.81, 
           71.81, 71.81, 71.81, 71.81, 71.81, 71.81, 71.82, 71.82, 71.82, 
           71.81, 71.81, 71.81, 71.82, 71.82, 71.82, 71.82, 71.82, 71.82, 
           71.82, 71.82, 71.82, 71.82, 71.82, 71.83, 71.83, 71.83, 71.83, 
           71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 
           71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 
           71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 71.83, 
           71.83)

TR <- numeric(length(price)-1)
TR <- c(NA,TR)

for (i in 1: (length(price)-1)){

  if (price[i] == price[i+1]) {TR[i+1] = TR[i]}

  if (price[i] < price[i+1]) {TR[i+1] = 1}

  if (price[i] > price[i+1]) {TR[i+1] = -1}

}

这是我的输出: dput(TR) 产量

c(NA, -1, -1, -1, -1, 1, -1, -1, -1, 1, -1, -1, -1, 1, -1, 1, 
  -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, 
  -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, 
  1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, 
  1, 1, -1, 1, 1, 1, -1, 1, 1, 1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
  1, 1, 1, -1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 
  -1, 1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

这是我的会话信息:

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.4

loaded via a namespace (and not attached):
[1] chron_2.3-45  plyr_1.8.1    Rcpp_0.11.1   reshape2_1.4  stringr_0.6.2 tools_3.1.2  

你可以直接翻译 for 循环:

library(Rcpp)
cppFunction(
"IntegerVector proc(NumericVector x) {
  const int n = x.size();
  IntegerVector y(n);
  y[0] = NA_INTEGER;
  for (int i=1; i < n; ++i) {
    if (x[i] == x[i-1]) y[i] = y[i-1];
    else if (x[i] > x[i-1]) y[i] = 1;
    else y[i] = -1;
  }
  return y;
}")

与往常一样,与基础 R 中的 for 循环相比,使用 Rcpp 可以获得相当大的加速:

proc.for <- function(price) {
  TR <- numeric(length(price)-1)
  TR <- c(NA,TR)
  for (i in 1: (length(price)-1)){
    if (price[i] == price[i+1]) {TR[i+1] = TR[i]}
    if (price[i] < price[i+1]) {TR[i+1] = 1}
    if (price[i] > price[i+1]) {TR[i+1] = -1}
  }
  return(TR)
}
proc.aaron <- function(price) {
  change <- sign(diff(price))
  good <- change != 0
  goodval <- change[good]
  c(NA, goodval[cumsum(good)])
}
proc.jbaums <- function(price) {
  TR <- sign(diff(price))
  TR[TR==0] <- TR[which(TR != 0)][findInterval(which(TR == 0), which(TR != 0))]
  TR
}

all.equal(proc(price), proc.for(price), proc.aaron(price), proc.jbaums(price))
# [1] TRUE
library(microbenchmark)
microbenchmark(proc(price), proc.for(price), proc.aaron(price), proc.jbaums(price))
# Unit: microseconds
#                expr     min       lq      mean   median       uq      max neval
#         proc(price)   1.871   2.5380   3.92111   3.1110   4.5880   15.318   100
#     proc.for(price) 408.200 448.2830 542.19766 484.1265 546.3255 1821.104   100
#   proc.aaron(price)  23.916  25.5770  33.53259  31.5420  35.8575  190.372   100
#  proc.jbaums(price)  33.536  38.8995  46.80109  43.4510  49.3555  112.306   100

与 for 循环相比,我们看到加速超过 100 倍,与所提供向量的向量化替代方案相比,加速超过 10 倍。

对于更大的向量(此处测试长度为 100 万),加速甚至更显着:

price.big <- rep(price, times=5000)
all.equal(proc(price.big), proc.for(price.big), proc.aaron(price.big), proc.jbaums(price.big))
# [1] TRUE
microbenchmark(proc(price.big), proc.for(price.big), proc.aaron(price.big), proc.jbaums(price.big))
# Unit: milliseconds
#                    expr         min          lq        mean      median          uq        max neval
#         proc(price.big)    1.442119    1.818494    5.094274    2.020437    2.771903   56.54321   100
#     proc.for(price.big) 2639.819536 2699.493613 2949.962241 2781.636460 3062.277930 4472.35369   100
#   proc.aaron(price.big)   91.499940   99.859418  132.519296  140.521212  147.462259  207.72813   100
#  proc.jbaums(price.big)  117.242451  138.528214  170.989065  170.606048  180.337074  487.13615   100

现在,与 for 循环相比,我们的速度提高了 1000 倍,与矢量化 R 函数相比,速度提高了约 70 倍。即使在这种大小下,如果只调用一次函数,也不清楚 Rcpp 是否优于矢量化 R 解决方案,因为编译 Rcpp 代码肯定至少需要 100 毫秒。如果这是一段在您的分析中重复调用的代码,那么加速非常有吸引力。

这可以在 R 中轻松矢量化。

例如 difffindInterval:

TR <- sign(diff(price))
TR[TR==0] <- TR[which(TR != 0)][findInterval(which(TR == 0), which(TR != 0))]

也许先尝试矢量化。尽管这可能不如 Rcpp 快,但它更直接。

f2 <- function(price) {
    change <- sign(diff(price))
    good <- change != 0
    goodval <- change[good]
    c(NA, goodval[cumsum(good)])
}

而且它仍然比 R for 循环有很大的加速。

f1 <- function(price) {
    TR <- numeric(length(price)-1)
    TR <- c(NA,TR)
    for (i in 1: (length(price)-1)){
        if (price[i] == price[i+1]) {TR[i+1] = TR[i]}
        if (price[i] < price[i+1]) {TR[i+1] = 1}
        if (price[i] > price[i+1]) {TR[i+1] = -1}
    }
    TR
}

microbenchmark(f1(price), f2(price), times=100)
## Unit: microseconds
##       expr     min       lq      mean   median       uq      max neval cld
##  f1(price) 550.037 592.9830 756.20095 618.7910 703.8335 3042.530   100   b
##  f2(price)  36.915  39.3285  56.45267  45.5225  60.1965  184.536   100  a 

你可以试试字节编译。查看使用与 Rcpp 代码相同的 if-else-if-else 逻辑的 R 循环也很有用。使用 R 3.1.2 我得到

f1 <- function(price) {
    TR <- numeric(length(price)-1)
    TR <- c(NA,TR)
    for (i in 1: (length(price)-1)){
        if (price[i] == price[i+1]) {TR[i+1] = TR[i]}
        if (price[i] < price[i+1]) {TR[i+1] = 1}
        if (price[i] > price[i+1]) {TR[i+1] = -1}
    }
    return(TR)
}

f2 <- function(price) {
    TR <- numeric(length(price)-1)
    TR <- c(NA,TR)
    for (i in 1: (length(price)-1)){
        if (price[i] == price[i+1]) {TR[i+1] = TR[i]}
        else if (price[i] < price[i+1]) {TR[i+1] = 1}
        else {TR[i+1] = -1}
    }
    return(TR)
}

library(compiler)
f1c <- cmpfun(f1)
f2c <- cmpfun(f2)

library(microbenchmark)
microbenchmark(f1(price), f2(price), f1c(price), f2c(price), times = 1000)
## Unit: microseconds
##       expr     min       lq     mean   median       uq       max neval  cld
##  f1(price) 536.619 570.3715 667.3520 586.2465 609.9280 45046.462  1000    d
##  f2(price) 328.592 351.2070 386.5895 365.0245 381.4850  1302.497  1000   c 
## f1c(price) 167.570 182.4645 218.9537 192.4780 204.7810  7843.291  1000  b  
## f2c(price)  96.644 107.4465 124.1324 113.5470 121.5365  1019.389  1000 a   

R-devel 将于 4 月作为 R 3.2.0 发布,在用于此类标量计算的字节码引擎中有许多改进;我得到了

microbenchmark(f1(price), f2(price), f1c(price), f2c(price), times = 1000)
## Unit: microseconds
##        expr     min       lq      mean   median       uq      max neval  cld
##   f1(price) 490.300 520.3845 559.19539 533.2050 548.6850 1330.219  1000    d
##   f2(price) 298.375 319.7475 348.71384 330.4535 342.6405 1813.113  1000   c 
##  f1c(price)  61.947  66.3255  68.01555  67.7270  69.5470  138.308  1000  b  
##  f2c(price)  36.334  38.9500  40.45085  40.1830  41.8610   55.909  1000 a   

这会让您进入与本示例中的矢量化解决方案相同的一般范围。字节码引擎仍有进一步改进的空间,应该会在未来的版本中实现。

所有解决方案在处理 NA/NaN 值的方式上有所不同,这对您来说可能重要也可能不重要。