Why doesn't R throw an error when I use only the initial part of my column name in a data frame?

I have a data frame with various columns, including sender_bank_flag. I ran the following two queries on it.

sum(s_50k_sample$sender_bank_flag, na.rm=TRUE)

sum(s_50k_sample$sender_bank, na.rm=TRUE)

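Both queries gave me the same output, even though my data frame has no column named sender_bank. I expected the second call to throw an error. I didn't know R had such a feature! Does anyone know exactly what this feature is and how to make better use of it?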

It is probably worth expanding all the comments into an answer.


Both commenters pointed to the documentation page ?Extract:

Under "Recursive (list-like) objects":

Both "[[" and "$" select a single element of the list. The main difference is that "$" does not allow computed indices, whereas "[[" does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of "[[" can be controlled using the exact argument.
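As a minimal sketch of that equivalence, the snippet below uses a small made-up data frame `df` as a stand-in for s_50k_sample (the values and the `amount` column are assumptions, not the asker's data). It compares `$` with `[[` with and without `exact = FALSE`.

```r
# Hypothetical stand-in for s_50k_sample; the values are made up.
df <- data.frame(sender_bank_flag = c(1, 0, NA, 1),
                 amount = c(10, 20, 30, 40))

# `$` partially matches: "sender_bank" is a unique prefix of "sender_bank_flag".
full    <- sum(df$sender_bank_flag, na.rm = TRUE)
partial <- sum(df$sender_bank, na.rm = TRUE)

# `[[` is exact by default, so the prefix finds no column and yields NULL.
strict <- df[["sender_bank"]]

# With exact = FALSE, `[[` falls back to partial matching, like `$`.
loose <- df[["sender_bank", exact = FALSE]]
```

Here `full` and `partial` are the same sum, `strict` is NULL, and `loose` is the sender_bank_flag column. Writing `df[["sender_bank_flag"]]` in scripts is one way to avoid relying on the prefix match.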

Under "Character indices":

Character indices can in some circumstances be partially matched (see ?pmatch) to the names or dimnames of the object being subsetted (but never for subassignment). Unlike S (Becker et al p. 358), R never uses partial matching when extracting by "[", and partial matching is not by default used by "[[" (see argument exact).

Thus the default behaviour is to use partial matching only when extracting from recursive objects (except environments) by "$". Even in that case, warnings can be switched on by options(warnPartialMatchDollar = TRUE).
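The rules quoted above can be checked with a short sketch. The data frames `df` and `df2` and the column sender_bank_code are made up for illustration. It covers `[` refusing the prefix, the opt-in warning, an ambiguous prefix, and subassignment.

```r
# Hypothetical data; only the column name comes from the question.
df <- data.frame(sender_bank_flag = c(1, 0, NA, 1))

# `[` never partially matches: on a data frame the prefix is an error.
bracket_failed <- inherits(try(df["sender_bank"], silent = TRUE), "try-error")

# Opt in to a warning whenever `$` falls back to a partial match.
old <- options(warnPartialMatchDollar = TRUE)
warned <- tryCatch({ df$sender_bank; FALSE }, warning = function(w) TRUE)
options(old)  # restore the previous setting

# An ambiguous prefix matches nothing, so `$` quietly returns NULL.
df2 <- data.frame(sender_bank_flag = 1:2, sender_bank_code = 3:4)
ambiguous <- df2$sender_bank

# Subassignment never partially matches: this adds a new column
# named sender_bank instead of overwriting sender_bank_flag.
df$sender_bank <- 0
```

Putting `options(warnPartialMatchDollar = TRUE)` in a startup file such as .Rprofile is one way to surface accidental prefix matches during interactive work.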

Note that the manual is dense, so take the time to digest it fully. I formatted the content and added the Stack Overflow threads after the relevant parts.


The links provided are well worth a long read.