RStudio error - creating large environment object: protect(): protection stack overflow
I want to create a large lookup table of key-value pairs, and tried to do so like this:
# actual use case is length ~5 million
key <- do.call(paste0, Map(stringi::stri_rand_strings, n=2e5, length = 16))
val <- sample.int(750, size = 2e5, replace = T)
make_dict <- function(keys, values){
require(rlang)
e <- new.env(size = length(keys))
l <- list2(!!!setNames(values, keys))
list2env(l, envir = e, hash = T) # problem in here...?
}
d <- make_dict(key, val)
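For comparison, the same construction can be sketched in base R without rlang. This is not the code from the question: make_dict_base and the sample keys are hypothetical, chosen only to keep the example small and deterministic. Note that list2env() uses its hash and size arguments only when envir is NULL, so here they are set on new.env() instead.

```r
# Hypothetical base-R variant of make_dict(); names and sample data are illustrative.
make_dict_base <- function(keys, values) {
  stopifnot(length(keys) == length(values), !anyDuplicated(keys))
  # new.env() is hashed by default; size is only an initial hint for the hash table
  e <- new.env(hash = TRUE, size = length(keys))
  # list2env() copies each named list element into the target environment
  list2env(setNames(as.list(values), keys), envir = e)
}

keys <- sprintf("key_%04d", 1:1000)   # small stand-in for the random keys
vals <- rep_len(1:750, 1000)
d_small <- make_dict_base(keys, vals)
length(d_small)                       # number of bindings in the environment
```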
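Problem
When make_dict is run, it throws Error: protect(): protection stack overflow. This happens specifically in RStudio, when the input is a vector of length greater than 49991, which seems very similar to this Stack Overflow post.
However, when I run an accessor function to fetch some values, it seems make_dict ran fine after all, as I cannot find anything odd in its result: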
`%||%` <- function(x,y) if(is.null(x)) y else x
grab <- function(...){
vector("integer", length(..2)) |>
(\(.){. = Vectorize(\(e, x) e[[x]] %||% NA_integer_, list("x"), T, F)(..1, ..2); .})()
}
out <- vector("integer", length(key))
out <- grab(d, sample(key)) # using sample to scramble the keys
anyNA(out) | !lobstr::obj_size(out) == lobstr::obj_size(val)
[1] FALSE
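The same round-trip check can also be sketched with base R's mget(), which looks up many names in an environment at once. This is a minimal sketch under assumptions: grab_base and the small sample dictionary are hypothetical and stand in for grab and d above.

```r
# Small illustrative dictionary (made-up keys, not the data from the question)
keys <- sprintf("key_%04d", 1:1000)
vals <- rep_len(1:750, 1000)
d_small <- list2env(setNames(as.list(vals), keys), envir = new.env())

# Hypothetical helper: look up many keys at once, NA for keys that are absent
grab_base <- function(env, x) {
  unlist(mget(x, envir = env, ifnotfound = list(NA_integer_)), use.names = FALSE)
}

idx <- sample(seq_along(keys))             # scramble the lookup order
out <- grab_base(d_small, keys[idx])
ok <- identical(out, vals[idx])            # values come back in the requested order
missing_val <- grab_base(d_small, "not_a_key")   # absent keys fall back to NA
```

Comparing against vals[idx] checks each value against its own key, which is a stricter test than comparing object sizes.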
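Running the same code in RGui does not throw an error.
Quirks
- For sizes > 5e4, the d environment object does not show up in RStudio's Environment pane.
- The R console quickly returns to > (indicating the function has finished), but is unresponsive until the error is thrown
- The error is thrown whether options(expressions = 5e5) is set manually or the default of 5000 is kept
- The time at which the error is thrown is proportional to the size of the input vector
- tryCatch(make_dict(key, val), error = function(e) e) does not catch an error
- The error also occurs if the code is run from a package (a packaged version is available via remotes::install_github("D-Se/minimal"))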
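The traceback shown further down points at .rs.describeObject, which RStudio calls for its Environment pane, rather than at make_dict itself. If that reading is right, it would explain why tryCatch catches nothing: a handler only covers errors raised while its own expression is being evaluated. A toy illustration of that behaviour follows; make_deferred is hypothetical and unrelated to RStudio internals.

```r
# Illustrative only: a function whose result triggers an error later,
# after the call that created it has already returned.
make_deferred <- function() {
  function() stop("raised later, outside the original call")
}

# The call itself succeeds, so the error handler never runs here
res <- tryCatch(make_deferred(), error = function(e) "caught")
is.function(res)

# The error only surfaces when something else evaluates the result
msg <- tryCatch(res(), error = function(e) conditionMessage(e))
msg
```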
Question
What is going on here? How can this type of error be troubleshot?
options(error = traceback), as suggested here, didn't give any results. Inserting a browser() after list2env in the make_dict function throws an error long after the browser has opened. A traceback() gives the function .rs.describeObject, which is used to generate the summary in the Environment pane, and can be found here.
traceback()
# .rs.describeObject
(function (env, objName, computeSize = TRUE)
{
obj <- get(objName, env)
hasNullPtr <- .Call("rs_hasExternalPointer", obj, TRUE, PACKAGE = "(embedding)")
if (hasNullPtr) {
val <- "<Object with null pointer>"
desc <- "An R object containing a null external pointer"
size <- 0
len <- 0
}
else {
val <- "(unknown)"
desc <- ""
size <- if (computeSize)
object.size(obj)
else 0
len <- length(obj)
}
class <- .rs.getSingleClass(obj)
contents <- list()
contents_deferred <- FALSE
if (is.language(obj) || is.symbol(obj)) {
val <- deparse(obj)
}
else if (!hasNullPtr) {
if (size > 524288) {
len_desc <- if (len > 1)
paste(len, " elements, ", sep = "")
else ""
if (is.data.frame(obj)) {
val <- "NO_VALUE"
desc <- .rs.valueDescription(obj)
}
else {
val <- paste("Large ", class, " (", len_desc,
format(size, units = "auto", standard = "SI"),
")", sep = "")
}
contents_deferred <- TRUE
}
else {
val <- .rs.valueAsString(obj)
desc <- .rs.valueDescription(obj)
if (class == "data.table" || class == "ore.frame" ||
class == "cast_df" || class == "xts" || class ==
"DataFrame" || is.list(obj) || is.data.frame(obj) ||
isS4(obj)) {
if (computeSize) {
contents <- .rs.valueContents(obj)
}
else {
val <- "NO_VALUE"
contents_deferred <- TRUE
}
}
}
}
list(name = .rs.scalar(objName), type = .rs.scalar(class),
clazz = c(class(obj), typeof(obj)), is_data = .rs.scalar(is.data.frame(obj)),
value = .rs.scalar(val), description = .rs.scalar(desc),
size = .rs.scalar(size), length = .rs.scalar(len), contents = contents,
contents_deferred = .rs.scalar(contents_deferred))
})(<environment>, "d", TRUE)
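This GitHub issue, pointed out by @technocrat, discusses a known bug in earlier versions of RStudio related to disabling the null external pointer check. It has since been resolved by adding an extra preference check, .rs.readUiPref("check_null_external_pointers"), to .rs.describeObject().
To check whether code is being run from RStudio, and whether that version is older than a given version number (here I use the current official release), the following can be used in a function, or in a package's .onAttach: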
if(!is.na(Sys.getenv("RSTUDIO", unset = NA)) && .rs.api.versionInfo()$version < "2021.9.1.372"){
# warning or action
}
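A slightly more defensive sketch of the same check is given below, under stated assumptions: rstudio_older_than is a hypothetical helper, it assumes RStudio sets the RSTUDIO environment variable to "1", and it returns FALSE when .rs.api.versionInfo() cannot be found, so it is safe to call from plain R or RGui.

```r
# Hypothetical helper; the default cutoff is the version string used above.
rstudio_older_than <- function(cutoff = "2021.9.1.372") {
  # Assumption: RStudio sets RSTUDIO to "1" in its R sessions
  if (!identical(Sys.getenv("RSTUDIO"), "1")) return(FALSE)
  # .rs.api.versionInfo() only exists inside RStudio, so look it up defensively
  info_fun <- get0(".rs.api.versionInfo", mode = "function")
  if (is.null(info_fun)) return(FALSE)
  # Compare as version numbers, not as strings
  numeric_version(as.character(info_fun()$version)) < numeric_version(cutoff)
}

# Why the explicit numeric_version() matters when both sides are plain strings:
as_strings  <- "2021.9.1.372" < "2021.10.0"                                    # character by character
as_versions <- numeric_version("2021.9.1.372") < numeric_version("2021.10.0")  # component by component
```

Comparing two plain strings would rank "2021.9" after "2021.10", so converting both sides to numeric_version keeps the ordering correct across releases.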