Check of microbenchmark results fails with data.table changed by reference

Question

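There are a few answers on SO where timings are compared without checking the results. However, I prefer to see whether an expression is correct and fast.

The microbenchmark package supports this with its check parameter. Unfortunately, the check fails for expressions that change a data.table by reference, i.e., the check does not recognize that the results differ.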

Case 1: Check of data.table expressions works as expected

library(data.table)
library(microbenchmark)

# minimal data.table 1 col, 3 rows
dt <- data.table(x = c(1, 1, 10))

# define check function as in example section of help(microbenchmark)
my_check <- function(values) {
  all(sapply(values[-1], function(x) identical(values[[1]], x)))
}

The benchmark cases are designed to return different results. Thus,

microbenchmark(
  f1 = dt[, mean(x)],
  f2 = dt[, median(x)],
  check = my_check
)

returns an error message, as expected:

Error: Input expressions are not equivalent.
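
To see what the check function is given in this case, it can be called by hand. This is a minimal sketch, assuming (as described in help(microbenchmark)) that the check function receives a list with one result per expression; the name values_case1 is made up for illustration.

```r
library(data.table)

dt <- data.table(x = c(1, 1, 10))

# same check function as above
my_check <- function(values) {
  all(sapply(values[-1], function(x) identical(values[[1]], x)))
}

# hypothetical stand-in for the list of results handed to the check
values_case1 <- list(
  f1 = dt[, mean(x)],    # 4
  f2 = dt[, median(x)]   # 1
)

my_check(values_case1)   # FALSE: the two results are independent numeric values
```

Each element here is a separate numeric vector, so identical() compares two distinct values and reports the difference.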

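Case 2: Check fails for data.table expressions that change by reference

Now the expressions are modified to change dt by reference. Note that the same check function is used.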

microbenchmark(
  f1 = dt[, y := mean(x)],
  f2 = dt[, y := median(x)],
  check = my_check
)

now returns

 expr     min      lq     mean   median       uq     max neval cld
   f1 576.947 625.174 642.9820 640.7110 661.1870 732.391   100  a 
   f2 602.022 658.384 684.7076 678.9975 694.0825 978.600   100   b

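So the check of results has failed here, although the two expressions differ. (The timings are beside the point.)

I understand that the check is bound to fail because dt is changed by reference. When the results of the expressions are compared, every result refers to the same object in its last changed state.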
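
The effect can be reproduced without microbenchmark. The following is a small sketch, assuming a freshly created dt; the names r1 and r2 are made up for illustration, and address() is the helper exported by data.table.

```r
library(data.table)

dt <- data.table(x = c(1, 1, 10))

my_check <- function(values) {
  all(sapply(values[-1], function(x) identical(values[[1]], x)))
}

# `:=` modifies dt in place and returns dt itself (invisibly)
r1 <- dt[, y := mean(x)]
r2 <- dt[, y := median(x)]

# both "results" are the very same object as dt
address(r1) == address(dt)   # TRUE
address(r2) == address(dt)   # TRUE

# so both show the state after the last assignment (the median)
r1$y                         # 1 1 1

my_check(list(r1, r2))       # TRUE, although the expressions differ
```

Because both list elements point to dt, identical() compares the object with itself and cannot see that the first expression once produced a different column.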

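Question

How can I modify the check function and/or the expressions so that the check reliably detects different results even when a data.table is changed by reference?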

Answer 1

The simplest way is to use copy():

microbenchmark(
    f1 = copy(dt)[, y := mean(x)],
    f2 = copy(dt)[, y := median(x)],
    check = my_check, times=1L
)
# Error: Input expressions are not equivalent.

Adding copy(dt) to the mix gives an idea of the time spent on copying (if necessary, that time can be subtracted from the run times of f1 and f2).

microbenchmark(
    f1 = copy(dt)[, y := mean(x)],
    f2 = copy(dt)[, y := median(x)],
    f3 = copy(dt),
    times=10L
)
# Unit: microseconds
#  expr     min      lq     mean   median      uq     max neval cld
#    f1 298.690 306.508 331.6364 315.1400 347.788 414.264    10   b
#    f2 319.075 322.475 373.3873 329.3895 336.268 746.134    10   b
#    f3  19.180  19.750  28.3504  25.1745  26.111  70.016    10   a
