Order matters when using all.equal(x, y) vs all.equal(y, x): potential infinite recursion?

I was testing different R objects for equality and found that comparing the objects in the "wrong" order sometimes produces the following error:

Error: C stack usage 7975620 is too close to the limit

I assume this is a sign of recursion that goes too deep?
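As a rough illustration of that assumption, the sketch below triggers unbounded recursion on purpose and captures the resulting error message. The function name recurse_forever is hypothetical, and which limit R reports first (the expression nesting limit or the C stack limit) depends on the platform and settings, so no particular message is claimed here.

```r
# Hypothetical function that calls itself without a base case.
recurse_forever <- function(n = 0) recurse_forever(n + 1)

# tryCatch() returns the error message instead of aborting the script.
msg <- tryCatch(
  recurse_forever(),
  error = function(e) conditionMessage(e)
)

# Depending on the system this mentions either nested evaluation
# or C stack usage; both indicate recursion that went too deep.
print(msg)
```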

It should be reproducible with the following comparisons:

all.equal(mean, sd)    # no error
all.equal(sd, mean)    # Error: C stack usage 7975620 is too close to the limit
all.equal(NULL, mean)  # no error
all.equal(mean, NULL)  # Error: C stack usage 7975620 is too close to the limit
all.equal(mean, sum); all.equal(sd, sum)  # no error
all.equal(sum, NULL)   # no error
all.equal(sd, var)     # no error
all.equal(var, mean)   # Error: C stack usage 7975620 is too close to the limit
all.equal(var, NULL)   # Error: C stack usage 7975620 is too close to the limit

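I know that the methods/functions I am comparing are implemented quite differently in R, and there seems to be a pattern in which comparisons fail depending on how a given function is implemented. Still, I would like to know whether all.equal is designed to behave this way (I could not find any note in the documentation about the order of the objects being compared). I am also curious about the cause and would appreciate it if someone could explain this behaviour.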

I can also reproduce these problems from the terminal in R --vanilla.

Session info:

R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.1.3

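Edit: I tried the code examples on an RStudio Server but could not reproduce the behaviour described above. The output of the all.equal function is also different there.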

Session info:

R version 4.0.3 (2020-10-10)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: openSUSE Leap 15.2

Matrix products: default
BLAS:   /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8
 [6] LC_MESSAGES=de_DE.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.0.3 tools_4.0.3

I traced the error for all.equal(sd, mean). It actually originates in the call all.equal.environment(environment(sd), environment(mean), ignore.environment = FALSE).
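To see which environments are involved, the sketch below lists, for each function from the question, whether it is a primitive and which environment it was defined in. The variable names (funs, is_prim, env_name) are made up for this illustration. all.equal is an S3 generic that dispatches on its first argument, and sum is a primitive with no environment, which may be part of why the order and the choice of function change the outcome.

```r
# The functions compared in the question.
funs <- list(mean = mean, sd = sd, var = var, sum = sum)

# Primitives such as sum are implemented in C and carry no environment.
is_prim <- vapply(funs, is.primitive, logical(1))

# Name of the enclosing environment (a package namespace here),
# or NA when environment() returns NULL.
env_name <- vapply(funs, function(f) {
  e <- environment(f)
  if (is.null(e)) NA_character_ else environmentName(e)
}, character(1))

print(data.frame(primitive = is_prim, environment = env_name))
```

Here mean lives in the base namespace while sd and var live in stats, so comparing them with environments taken into account means comparing two large namespaces.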

In the documentation for all.equal(), we see that the environment method has an extra argument, evaluate, which is a

logical indicating if “promises should be forced”

This defaults to TRUE, which appears to be what causes the stack usage problem.

To fix it, simply call all.equal(..., evaluate = FALSE):

all.equal(mean, sd, evaluate = FALSE)
#> [1] "target, current do not match when deparsed"                                                                                                   
#> [2] "names of environments differ: Lengths (1370, 1134) differ (string compare on first 1134) names of environments differ: 1134 string mismatches"

Created on 2022-03-29 by the reprex package (v2.0.1)

Results:

all.equal(mean, sd, evaluate = FALSE)    # no error
all.equal(sd, mean, evaluate = FALSE)    # no error
all.equal(NULL, mean, evaluate = FALSE)  # no error
all.equal(mean, NULL, evaluate = FALSE)  # no error
all.equal(mean, sum, evaluate = FALSE)   # no error
all.equal(sum, NULL, evaluate = FALSE)   # no error
all.equal(sd, var, evaluate = FALSE)     # no error
all.equal(var, mean, evaluate = FALSE)   # no error
all.equal(var, NULL, evaluate = FALSE)   # no error
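If you want to check both argument orders routinely, a small helper along these lines might be convenient. This is only a sketch: compare_both_ways is a hypothetical name, not part of base R, and it relies on methods that do not know the evaluate argument accepting it through their ... argument.

```r
# Hypothetical helper: run all.equal() in both orders with evaluate = FALSE
# so that promises inside the functions' environments are not forced.
compare_both_ways <- function(x, y, ...) {
  list(
    xy = all.equal(x, y, evaluate = FALSE, ...),
    yx = all.equal(y, x, evaluate = FALSE, ...)
  )
}

res <- compare_both_ways(sd, mean)

# all.equal() returns TRUE or a character vector describing differences,
# so isTRUE() is the safe way to turn the result into a logical.
print(vapply(res, isTRUE, logical(1)))
```

Wrapping the result in isTRUE() matters because all.equal() does not return FALSE when the objects differ.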