结合模型的两个 MuMIn 子集的项在 R 中创建新的子集

Combining the terms of two MuMIn subset of models to create new subset in R

我正在进行一项分析,我分两步在变量之间进行选择。第 1 步:从两组变量(例如,内在和外在变量)中的每一个中选择最佳变量和变量组合。第 2 步:采用每个子集的最佳变量组合,并创建一组新模型,这些模型仅具有每个子集中预先 select 的变量所允许的组合。我正在使用 MuMin 包中的函数 dredge

我可以轻松地手动对变量组合进行子集化,如下所示。

第 1 步:select 每组变量的变量组合

library(MuMIn)
data(Cement)
# Choosing between X1 & X2 - Set of variable #1
fm1 <- lm(y ~ X1 + X2, Cement, na.action = na.fail)
m1 <- dredge(fm1)
sm1 <- subset(m1, delta < 32) # delta < 32 is only chosen to have 2 selected models
sm1
# Global model call: lm(formula = y ~ X1 + X2, data = Cement, na.action = na.fail)
# ---
#   Model selection table 
#   (Intrc)    X1     X2 df  logLik  AICc delta weight
# 4   52.58 1.468 0.6623  4 -28.156  69.3  0.00      1
# 3   57.42       0.7891  3 -46.035 100.7 31.42      0

# Choosing between X3 & X4 - Set of variable #2
fm2 <- lm(y ~ X3 + X4, Cement, na.action = na.fail)
m2 <- dredge(fm2)
sm2 <- subset(m2, delta < 20)
sm2
# Global model call: lm(formula = y ~ X3 + X4, data = Cement, na.action = na.fail)
# ---
#  Model selection table 
#   (Intrc)   X3      X4 df  logLik  AICc delta weight
# 4   131.3 -1.2 -0.7246  4 -35.372  83.7  0.00      1
# 3   117.6      -0.7382  3 -45.872 100.4 16.67      0

第 2 步:查看具有两组变量的模型,但仅包括上面针对每组变量 selected 的组合。

# Only looking at the combinations chosen above with subset.
fm3 <- lm(y ~., Cement, na.action = na.fail)
m3 <- dredge(fm3, subset = ((X1 & X2) | X2) & ((X3 & X4) | X4))
m3
# Global model call: lm(formula = y ~ ., data = Cement, na.action = na.fail)
# ---
#  Model selection table  
#    (Intrc)    X1      X2      X3      X4 df  logLik  AICc delta weight
# 12   71.65 1.452  0.4161         -0.2365  5 -26.933  72.4  0.00  0.921
# 15  203.60       -0.9234 -1.4480 -1.5570  5 -29.734  78.0  5.60  0.056
# 16   62.41 1.551  0.5102  0.1019 -0.1441  6 -26.918  79.8  7.40  0.023
# 11   94.16        0.3109         -0.4569  4 -45.761 104.5 32.08  0.000

当你只有几个变量时,subset 效果很好,就像上面的例子一样,但在我的例子中,我在每组中有更多的变量 selected。

有没有办法不用手动指定子集中的变量就可以做到这一点?

非常感谢!

这不是最好的解决方案,但可以满足您的需求。

library(MuMIn)
options(na.action = na.fail)
fm1 <- lm(y ~ X1 + X2, Cement)
m1 <- dredge(fm1)
ms1 <- subset(m1, delta < 32)
fm2 <- lm(y ~ X3 + X4, Cement)
m2 <- dredge(fm2)
ms2 <- subset(m2, delta < 20)

a1 <- !is.na(ms1[, attr(ms1, "terms")])
a2 <- !is.na(ms2[, attr(ms2, "terms")])
allterms <- c(attr(ms1, "terms"), attr(ms2, "terms"))
allterms[allterms == "(Intercept)"] <- "1"
n1 <- nrow(a1)
n2 <- nrow(a2)

res <- vector("list", n1 * n2)
k <- 0L
for(i in 1L:n1) for(j in 1L:n2) {
    frm <- reformulate(allterms[c(a1[i, ], a2[j, ])], response = ".")
    res[[k <- k + 1L]] <- update(fm1, formula = frm)
}

model.sel(res)
# Model selection table 
#    (Intrc)    X1      X2      X3      X4 df  logLik  AICc delta weight
# 4    52.58 1.468  0.6623                  4 -28.156  69.3  0.00  0.566
# 2    71.65 1.452  0.4161         -0.2365  5 -26.933  72.4  3.13  0.119
# 3    48.19 1.696  0.6569  0.2500          5 -26.952  72.5  3.16  0.116
# 10  103.10 1.440                 -0.6140  4 -29.817  72.6  3.32  0.107
# ...