Why is match.call useful?
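In the body of some R functions, for example lm, I see calls to the match.call function. As its help page says, when used inside a function match.call returns a call in which the argument names are specified; this is supposed to be useful for passing a large number of arguments to another function.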
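As a small illustration of that help-page description, the sketch below uses a hypothetical function f (not part of any package) whose only job is to return its own matched call. It suggests two things: positional arguments come back labelled with their formal names, and the argument expressions are kept unevaluated.

```r
# f is a hypothetical function used only for illustration.
f <- function(x, y, ...) match.call()

cl <- f(1 + 2, 5, y = sqrt(16))
print(cl)
# f(x = 1 + 2, y = sqrt(16), 5)

# The arguments are stored as unevaluated expressions, not values.
print(class(cl$x))   # "call"
print(eval(cl$y))    # 4

# With expand.dots = FALSE, everything matched by ... is kept in a
# single element named "...".
g <- function(x, y, ...) match.call(expand.dots = FALSE)
print(names(g(1, 2, 3)))
```

Because the result is an ordinary call object, it can be subset, edited and evaluated like a list, which is what the lm code relies on.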
For example, in the lm function we see a call to the function model.frame:
function (formula, data, subset, weights, na.action, method = "qr",
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
    contrasts = NULL, offset, ...)
{
    cl <- match.call()
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "subset", "weights", "na.action",
        "offset"), names(mf), 0L)
    mf <- mf[c(1L, m)]
    mf$drop.unused.levels <- TRUE
    mf[[1L]] <- quote(stats::model.frame)
    mf <- eval(mf, parent.frame())
    ...
...Why is this more useful than a direct call to model.frame that specifies the argument names, as in the following?
function (formula, data, subset, weights, na.action, method = "qr",
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
    contrasts = NULL, offset, ...)
{
    mf <- model.frame(formula = formula, data = data,
        subset = subset, weights = weights)
    ...
(Note that match.call has another use that I do not discuss here: storing the call in the resulting object.)
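One reason that is relevant here is that match.call captures the language of the call without evaluating it, and in this case that allows lm to treat some of the "missing" variables as "optional". Consider: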
lm(x ~ y, data.frame(x=1:10, y=runif(10)))
Compare that with:
lm2 <- function (
  formula, data, subset, weights, na.action, method = "qr",
  model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE,
  contrasts = NULL, offset, ...
) {
  mf <- model.frame(
    formula = formula, data = data, subset = subset, weights = weights
  )
}
lm2(x ~ y, data.frame(x=1:10, y=runif(10)))
## Error in model.frame.default(formula = formula, data = data, subset = subset, :
## invalid type (closure) for variable '(weights)'
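In lm2, since weights is "missing" but you still use it in weights=weights, R tries to use the stats::weights function, which is clearly not what was intended. You could get around this by testing for missingness before you call model.frame, but at that point match.call starts looking pretty good. Look at what happens if we debug the call: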
debug(lm2)
lm2(x ~ y, data.frame(x=1:10, y=runif(10)))
## debugging in: lm2(x ~ y, data.frame(x = 1:10, y = runif(10)))
## debug at #5: {
## mf <- model.frame(formula = formula, data = data, subset = subset,
## weights = weights)
## }
Browse[2]> match.call()
## lm2(formula = x ~ y, data = data.frame(x = 1:10, y = runif(10)))
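match.call doesn't involve the missing arguments at all.
You could argue that the optional arguments should have been made explicitly optional via default values, but that's not what happened here.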
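To make the call-rewriting idea concrete, here is a minimal sketch that copies the relevant steps from the lm excerpt into a small wrapper. The name lm3 and the toy data frame d are assumptions made for illustration; only match.call, match, eval, parent.frame and stats::model.frame are real functions.

```r
# lm3 is a hypothetical wrapper, not part of any package. It follows the
# same steps as the lm body: capture the call, keep only the arguments
# that model.frame understands, swap the function name, and evaluate the
# rewritten call in the caller's frame.
lm3 <- function(formula, data, subset, weights, na.action, offset, ...) {
  mf <- match.call(expand.dots = FALSE)
  keep <- match(c("formula", "data", "subset", "weights", "na.action",
                  "offset"), names(mf), 0L)
  mf <- mf[c(1L, keep)]            # unsupplied arguments simply are not there
  mf$drop.unused.levels <- TRUE
  mf[[1L]] <- quote(stats::model.frame)
  eval(mf, parent.frame())
}

# Toy data, assumed for the example.
d <- data.frame(x = 1:10, y = seq(0.1, 1, by = 0.1), w = rep(2, 10))

mf_plain <- lm3(x ~ y, d)                # no weights supplied, no error
print(names(mf_plain))

mf_w <- lm3(x ~ y, d, weights = w)       # w is looked up inside d
print(names(mf_w))
```

Because the rewritten call only contains what the user actually typed, a missing weights never reaches model.frame, and a supplied weights = w is passed on as the expression w rather than as a variable named weights.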
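Here's an example. In it, calc_1 is a function with a large number of numeric arguments that wants to add and multiply them. It delegates this work to calc_2, an auxiliary function that takes most of these arguments. But calc_2 also takes some extra arguments (q to t), which calc_1 can't supply from its own actual arguments. Instead, it passes them as extras.
The call to calc_2 would be truly horrendous if it were written out to show everything that calc_1 passes to it. So instead, we assume that if calc_1 and calc_2 share a formal, they give it the same name. This makes it possible to write a caller that works out which arguments calc_1 can pass to calc_2, constructs a call that will do so, and feeds in the extra values to complete it. The comments in the code below should make this clear.
Incidentally, library "tidyverse" is only needed for %>% and str_c, which I used to define calc_2, library "rlang" supplies call2 and !!!, and library "assertthat" is used for one assertion. (In a realistic program, though, I'd put in assertions to check the arguments.)
Here's the output: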
> calc_1( a=1, b=11, c=2, d=22, e=3, f=33, g=4, h=44, i=5, j=55, k=6
+ , l=66, m=7, n=77, o=8, p=88
+ )
[1] "87654321QRST"
And here's the code:
library( tidyverse )
library( rlang )
library( assertthat )
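`%(%` <- function( f2, extras ) call_with_extras( f2, extras )
#
# This is the operator for calling
# a function with arguments passed
# from its parent, supplemented
# with extras. It is a thin wrapper, so
# that call_with_extras() can be defined
# further down and is only looked up when
# the operator is used. See
# call_with_extras() below.
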
# A function with a very long
# argument list. It wants to call
# a related function which takes
# most of these arguments and
# so has a long argument list too.
# The second function takes some
# extra arguments.
#
calc_1 <- function( a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p )
{
  calc_2 %(% list( t = "T", q = "Q", s = "S", r = "R" )
  #
  # Call it with those extras, passing
  # all the others that calc_2() needs
  # as well. %(% is my function for
  # doing so: see below.
}

# The function that we call above. It
# uses its own arguments q to t, as
# well as those from calc_1().
#
calc_2 <- function( a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t )
{
  ( a + c * 10 + e * 100 + g * 1000 + i * 10000 + k * 100000 +
    m * 1000000 + o * 10000000 ) %>%
    str_c( q, r, s, t )
}

# Calls function f2. Passes f2 whichever
# arguments it needs from its caller.
# Corresponding formals should have the
# same name in both. Also passes f2 extra
# arguments from the named list extras.
# Its names should match the
# corresponding formals of f2.
#
call_with_extras <- function( f2, extras )
{
  f1_call <- match.call( sys.function(1), sys.call(1) )
  # A call object. Frame 1 is the outermost
  # function call, so this assumes that f1
  # (here calc_1) was called from top level.

  f1_actuals <- as.list( f1_call %>% tail(-1) )
  # Named list of f1's actuals.

  f1_formals <- names( f1_actuals )
  # Names of f1's formals.

  f2_formals <- names( formals( f2 ) )
  # Names of f2's formals.

  f2_formals_from_f1 <- intersect( f2_formals, f1_formals )
  # Names of f2's formals which f1 can supply.

  f2_formals_not_from_f1 <- setdiff( f2_formals, f1_formals )
  # Names of f2's formals which f1 can't supply.

  extra_formals <- names( extras )
  # Names of f2's formals supplied as extras.

  assert_that( setequal( extra_formals, f2_formals_not_from_f1 ) )
  # The last two should be equal.

  f2_actuals_from_f1 <- f1_actuals[ f2_formals_from_f1 ]
  # List of actuals which f1 can supply to f2.

  f2_actuals <- append( f2_actuals_from_f1, extras )
  # All f2's actuals.

  f2_call <- call2( f2, !!! f2_actuals )
  # Call to f2.

  eval( f2_call )
  # Run it.
}

# Test it.
#
calc_1( a=1, b=11, c=2, d=22, e=3, f=33, g=4, h=44, i=5, j=55, k=6
, l=66, m=7, n=77, o=8, p=88
)
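For comparison, the same idea can be sketched in base R only, without relying on frame number 1. This is a rough variant under stated assumptions, not the code above rewritten: pass_to, outer_fn and inner_fn are hypothetical names, the caller captures its own call with match.call() and hands it over explicitly, and the forwarded arguments are evaluated in the frame that called outer_fn, as lm does with eval(mf, parent.frame()).

```r
# pass_to, outer_fn and inner_fn are hypothetical names for illustration.
# pass_to forwards to f2 those arguments of caller_call whose names match
# formals of f2, adds the extras, and evaluates the new call in envir.
pass_to <- function(f2, caller_call, extras = list(), envir) {
  supplied <- as.list(caller_call)[-1]          # drop the function name
  shared <- intersect(names(formals(f2)), names(supplied))
  new_call <- as.call(c(list(f2), supplied[shared], extras))
  eval(new_call, envir)
}

inner_fn <- function(a, c, z) paste0(a + c, z)

outer_fn <- function(a, b, c) {
  cl <- match.call()            # outer_fn's own call, arguments unevaluated
  caller_env <- parent.frame()  # where the user's expressions make sense
  pass_to(inner_fn, cl, extras = list(z = "!"), envir = caller_env)
}

print(outer_fn(a = 1, b = 2, c = 3))
# "4!"

x <- 10
print(outer_fn(a = x, b = 2, c = x * 2))
# "30!"
```

Passing the call and the environment explicitly costs two extra lines in outer_fn, but the helper then also works when outer_fn is itself called from inside another function.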