Which linear model summary row corresponds to which term in the formula?
The summary of a linear model uses certain strings to denote the coefficients in its output, for example:
summary(lm(
    target ~ some.bool + some.factor + some.factor*some.value +
        some.factor:some.other,
    data.frame(target=rnorm(100), some.bool=sample(c(TRUE, FALSE), 100, TRUE),
               some.factor=sample(c('Y', 'N', 'M'), 100, TRUE), some.value=rnorm(100),
               some.other=rnorm(100))))
The rows of the resulting table are named `some.boolTRUE`, `some.factorN`, `some.factorY`, `some.value`, `some.factorN:some.value`, `some.factorY:some.value`, `some.factorM:some.other`, `some.factorN:some.other`, `some.factorY:some.other`.
How can I find out programmatically which rows of the table correspond to which terms of the input formula? I would like a mapping such as:
`some.boolTRUE` → some.bool
`some.factorN` → some.factor, some.factor*some.value
`some.factorY` → some.factor, some.factor*some.value
`some.value` → some.factor*some.value
`some.factorN:some.value` → some.factor*some.value
`some.factorN:some.other` → some.factor:some.other
My goal is to prepare a specific presentation of the results in which the linear regression output is grouped by input term.
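So, I noticed that the code generating these names lives deep inside the model.matrix function, which calls an external C function. I can use the following hack to recover the names built from a single term (term is an expression/symbol object taken from the formula itself):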
names.for.term <- function(term, data, order.as.in=term) {
    # construct a simple formula that has only the requested term
    f <- formula(substitute(~ x, list(x=term)))
    # make a terms object for manipulation
    term.terms <- terms(f, data=data)
    # what order do we want to consider variables in?
    requested.order <- na.omit(match(
        row.names(attr(terms(order.as.in), 'factors')),
        row.names(attr(term.terms, 'factors'))))
    # force the order of variables (setting row.names is enough;
    # values in this array are not important for the process of building
    # strings if you have only a single summand. if not, good luck)
    row.names(attr(term.terms, 'factors')) <-
        row.names(attr(term.terms, 'factors'))[requested.order]
    # we need model frame object to have columns in the same order as
    # rows above; types of variables (e.g. factors) are inferred from here
    m <- model.frame(f, data)[requested.order]
    # call deep into C code
    dimnames(.External2(stats:::C_modelmatrix, term.terms, m))[[2]][-1]
}
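Ugly, but it works. Because the strings depend on the order in which this call encounters the variables within the terms, you may want to pass the full formula as order.as.in. The only thing left is to invert the mapping, which is trivial at this point.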
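For comparison, here is a minimal sketch of a route that stays within the documented interface: the matrix returned by model.matrix carries an "assign" attribute giving, for each column, the index of the term it came from (0 for the intercept), and the "term.labels" attribute of the terms object lists those terms. Note that this maps coefficients to the expanded terms (some.factor*some.value becomes some.factor, some.value and some.factor:some.value), not to the summands as typed. The names d, fit, mm and term.of.coef are made up for this illustration.

```r
set.seed(1)
d <- data.frame(target = rnorm(100),
                some.bool = sample(c(TRUE, FALSE), 100, TRUE),
                some.factor = sample(c("Y", "N", "M"), 100, TRUE),
                some.value = rnorm(100),
                some.other = rnorm(100))
fit <- lm(target ~ some.bool + some.factor + some.factor*some.value +
              some.factor:some.other, d)

mm <- model.matrix(fit)
# one integer per column of mm: 0 = intercept, k = k-th entry of term.labels
assign.idx <- attr(mm, "assign")
labels <- attr(terms(fit), "term.labels")

# named character vector: coefficient name -> expanded term label
term.of.coef <- c("(Intercept)", labels)[assign.idx + 1]
names(term.of.coef) <- colnames(mm)

# inverted mapping: expanded term label -> coefficient names
coefs.by.term <- split(names(term.of.coef), term.of.coef)
```

Getting from the expanded terms back to the summands as written in the formula would still need a separate step, for example expanding each summand with terms() and matching the labels, with the same caveat about variable order mentioned above.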