lapply with "$" function

Say I have a list of data.frames
dflist <- list(data.frame(a=1:3), data.frame(b=10:12, a=4:6))

If I want to extract the first column from each item in the list, I can do

lapply(dflist, `[[`, 1)
# [[1]]
# [1] 1 2 3
# 
# [[2]]
# [1] 10 11 12

Why can't I use the "$" function in the same way?

lapply(dflist, `$`, "a")
# [[1]]
# NULL
# 
# [[2]]
# NULL

But these both work:

lapply(dflist, function(x) x$a)
`$`(dflist[[1]], "a")

I realize that in this case one could use

lapply(dflist, `[[`, "a")

but I'm working with an S4 object that doesn't seem to allow indexing via [[. For example

library(adegenet)
data(nancycats)
catpop <- genind2genpop(nancycats)
mylist <- list(catpop, catpop)

# works: index the list first, then use $ on the S4 object
mylist[[1]]$tab

#doesn't work
lapply(mylist, "$", "tab")
# Error in slot(x, name) : 
#   no slot of name "..." for this object of class "genpop"

#doesn't work
lapply(mylist, "[[", "tab")
# Error in FUN(X[[1L]], ...) : this S4 class is not subsettable

For the first example, you can just do:

lapply(dflist, `$.data.frame`, "a")

For the second, use the slot() accessor function

lapply(mylist, "slot", "tab")
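The same idea can be tried without installing adegenet. The sketch below is only an illustration: "Pop" is a made-up S4 class with a single "tab" slot, standing in for genpop, and it assumes nothing about how adegenet defines its classes.

```r
library(methods)

# "Pop" is a hypothetical stand-in class, not part of adegenet.
setClass("Pop", representation(tab = "matrix"))

objs <- list(
    new("Pop", tab = matrix(1:4, nrow = 2)),
    new("Pop", tab = matrix(5:8, nrow = 2))
)

# slot() is an ordinary function that evaluates its arguments,
# so the slot name can be passed through lapply's ... safely.
tabs_slot <- lapply(objs, slot, "tab")

# Equivalent wrapper using the @ operator.
tabs_at <- lapply(objs, function(x) x@tab)
```

Both calls return the list of "tab" matrices, one per object.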

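I'm not sure why method dispatch doesn't work in the first case, but the Note section of ?lapply does address this very issue of borked method dispatch for primitive functions like $:
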
 Note:

 [...]

 For historical reasons, the calls created by ‘lapply’ are
 unevaluated, and code has been written (e.g., ‘bquote’) that
 relies on this.  This means that the recorded call is always of
 the form ‘FUN(X[[i]], ...)’, with ‘i’ replaced by the current
 (integer or double) index.  This is not normally a problem, but it
 can be if ‘FUN’ uses ‘sys.call’ or ‘match.call’ or if it is a
 primitive function that makes use of the call.  This means that it
 is often safer to call primitive functions with a wrapper, so that
 e.g. ‘lapply(ll, function(x) is.numeric(x))’ is required to ensure
 that method dispatch for ‘is.numeric’ occurs correctly.
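Applied to the data.frame list from the question, the wrapper advice in the Note might look like the sketch below. The variable names via_wrapper and via_bracket are just labels for this illustration.

```r
dflist <- list(data.frame(a = 1:3), data.frame(b = 10:12, a = 4:6))

# Wrap $ in an ordinary function so it sees the literal name `a`
# instead of the ... that lapply would otherwise hand it.
via_wrapper <- lapply(dflist, function(x) x$a)

# [[ evaluates its index argument, so it can be passed to lapply directly.
via_bracket <- lapply(dflist, `[[`, "a")

# Both approaches give the "a" column of each data.frame.
stopifnot(identical(via_wrapper, via_bracket))
```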

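So it seems that this problem has more to do with $ and how it typically expects an unquoted name as its second parameter rather than a string. Look at this example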

dflist <- list(
    data.frame(a=1:3, z=31:33), 
    data.frame(b=10:12, a=4:6, z=31:33)
)
lapply(dflist, 
    function(x, z) {
        print(paste("z:",z)); 
        `$`(x,z)
    }, 
    z="a"
)

We see the results

[1] "z: a"
[1] "z: a"
[[1]]
[1] 31 32 33

[[2]]
[1] 31 32 33

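So the z value is being set to "a", but $ isn't evaluating its second parameter. It takes the symbol z literally, so it returns the "z" column rather than the "a" column. This leads to this fun set of results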

a<-"z"; `$`(dflist[[1]], a)
# [1] 1 2 3
a<-"z"; `$`(dflist[[1]], "z")
# [1] 31 32 33

a<-"z"; `$.data.frame`(dflist[[1]], a)
# [1] 31 32 33
a<-"z"; `$.data.frame`(dflist[[1]], "z")
# [1] 31 32 33

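When we call $.data.frame directly, we bypass the standard deparsing that the primitive performs before dispatching (this happens in the R source code for the primitive).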

The added catch with lapply is that it passes arguments along to the function via the ... mechanism. For example

lapply(dflist, function(x, z) sys.call())
# [[1]]
# FUN(X[[2L]], ...)

# [[2]]
# FUN(X[[2L]], ...)

This means that when $ is invoked, it deparses the ... to the string "...". This explains this behavior

dflist <- list(data.frame(a=1:3, "..."=11:13, check.names=FALSE))
lapply(dflist, `$`, "a")
# [[1]]
# [1] 11 12 13

The same thing happens when you try to use ... yourself

f <- function(x, ...) `$`(x, ...)

f(dflist[[1]], "a")
# [1] 11 12 13
`$`(dflist[[1]], "a")
# [1] 1 2 3
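One way to sidestep the problem when the column name has to travel through function arguments is to hand it to [[, which evaluates its index. The helper below is a minimal sketch: extract_col is a hypothetical name introduced here, not a base R function.

```r
# Hypothetical helper: pull one named column out of every data.frame in a list.
extract_col <- function(lst, name) {
    # [[ evaluates `name`, so the string held in the variable is used,
    # not the symbol itself.
    lapply(lst, function(x) x[[name]])
}

# Same data as above, including the column literally named "...".
dflist <- list(data.frame(a = 1:3, "..." = 11:13, check.names = FALSE))

res <- extract_col(dflist, "a")
```

Here res[[1]] is the "a" column even though a "..." column is present, because the name is evaluated rather than deparsed.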