attr(*, ".internal.selfref")=<externalptr> appearing in data.table in RStudio

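I'm a new user of the R data.table package, and I've noticed something unusual about my data.tables that I haven't found explained in the documentation or elsewhere on this site.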

When using the data.table package in RStudio and viewing a particular data.table in the 'Environment' pane, I see the following string appear in the data.table:

attr(*, ".internal.selfref")=<externalptr>

If I print the same data.table in the console, this string does not appear.

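Is this a bug, or just an inherent feature of data.table (or RStudio)? Should I be concerned about whether this is affecting how these data are handled by downstream processes?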

I am running the following versions:
data.table version 1.9.6, RStudio version 0.99.447, OS X 10.10.5

Apologies in advance if this is just me being an ignorant newbie.

I asked Matt Dowle, the main author of the data.table package, this very question a little while ago.

Is this a bug, or just an inherent feature of data.table (or Rstudio)?

Apparently this attribute is used internally by data.table. It is not a bug in RStudio; in fact, RStudio is doing its job by displaying the object's attributes.
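
If you want to see the attribute outside the Environment pane, here is a minimal sketch. The table `DT` and the helper variable names are made up for illustration. print() shows only the data, while str() and attributes() also list the attributes, which as far as I can tell is similar to what the Environment pane displays.

```r
library(data.table)

# Hypothetical example table, just for illustration
DT <- data.table(x = 1:3, y = c("a", "b", "c"))

# print() shows only the rows and columns, no attributes
print(DT)

# str() also lists the object's attributes, including .internal.selfref
str(DT)

# The attribute can be inspected directly; it is an external pointer
attr_names  <- names(attributes(DT))
has_selfref <- ".internal.selfref" %in% attr_names
ptr_type    <- typeof(attr(DT, ".internal.selfref"))

has_selfref
ptr_type
```

So the string is simply the way an external pointer attribute is displayed, not extra data in the table.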

Should I be concerned about whether this is affecting how these data are handled by downstream processes?

No, this won't affect anything.

For those curious about why this attribute was created, I believe it is explained in the data.table manual, under the section on setkey():

In v1.7.8, the key<- syntax was deprecated. The <- method copies the whole table and we know of no way to avoid that copy without a change in R itself. Please use the set* functions instead, which make no copy at all. setkey accepts unquoted column names for convenience, whilst setkeyv accepts one vector of column names. The problem (for data.table) with the copy by key<- (other than being slower) is that R doesn’t maintain the over allocated truelength, but it looks as though it has. Adding a column by reference using := after a key<- was therefore a memory overwrite and eventually a segfault; the over allocated memory wasn’t really there after key<-’s copy. data.tables now have an attribute .internal.selfref to catch and warn about such copies. This attribute has been implemented in a way that is friendly with identical() and object.size(). For the same reason, please use the other set* functions which modify objects by reference, rather than using the <- operator which results in copying the entire object.
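
To make the manual's advice concrete, here is a small sketch of the by-reference style it recommends. The table `DT`, its columns, and the helper variable names are my own assumptions for illustration; setkey(), setkeyv(), key(), `:=`, copy() and address() are functions from data.table.

```r
library(data.table)

# Hypothetical example table
DT <- data.table(id = c(3L, 1L, 2L), value = c(30, 10, 20))

addr_before <- address(DT)

# setkey() takes unquoted column names and sorts/keys DT by reference
setkey(DT, id)
key_after_setkey <- key(DT)

# setkeyv() takes a character vector of column names instead
setkeyv(DT, "value")
key_after_setkeyv <- key(DT)

# := adds a column by reference, without copying the whole table
DT[, doubled := value * 2]

# The object should still live at the same address, i.e. no copy was made
addr_after <- address(DT)
same_address <- identical(addr_before, addr_after)

# copy() makes a genuine copy; the manual says the attribute is
# implemented to be friendly with identical(), so this compares the data
DT2 <- copy(DT)
same_content <- identical(DT, DT2)
```

None of these calls use `<-` to modify an existing data.table, which is the kind of copy the .internal.selfref attribute is there to detect.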