R forward selection forcing variables to stay in equation

Question

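I am running a logistic regression with 755 observations and 16 variables. I am using the glm function together with step for variable selection. Forward selection found a best model with 8 variables. I want these variables to be forced to stay in the model, and then use glm and step to find the next best model with 9 variables (see below). I want to continue like this until forward selection has covered the models with 9 to 16 variables (that is, until all 16 variables have been selected).

My code looks like this: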

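Saturated model

full <- glm(PREVAP ~ SEX + TOTCHOL + AGE + SYSBP + DIABP + as.factor(CURSMOKE) +
              CIGPDAY + BMI + as.factor(DIABETES) + as.factor(BPMEDS) + HEARTRTE +
              GLUCOSE + as.factor(EDUC) + TIME + HDLC + LDLC,
            data = training, family = binomial(link = "logit"))
summary(full)
anova(full, test = "Chisq")

# Intercept-only starting model (its definition was not shown in the post)
null <- glm(PREVAP ~ 1, data = training, family = binomial(link = "logit"))

# Forward selection from the null model towards the saturated model.
# step() refits with the family stored in the model, so no family argument is needed.
full.forward <- step(null,
                     scope = list(lower = null, upper = full),
                     direction = "forward")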

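This gives me a model with 8 factors. I need to force these factors into the next model and use forward selection to find a model with 9 factors. How can I do that?

I have been told that bestglm and glmnet also allow this, but I am not familiar with these packages.

Can you help? These packages have a lot of options.

Sincerely, Mary A. Marion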

Answer 1

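You can do this by setting the object and scope arguments of the step function: pass the already selected model as object, and use its formula as the lower bound of scope so that its terms cannot be removed.

Here is an example using the diabetes in Pima Indian women dataset from the MASS package: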

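library(MASS)

# Combine the Pima test and training sets and recode the response as 0/1
data = rbind(Pima.te, Pima.tr)
data$type = ifelse(data$type == "Yes", 1, 0)

# Saturated model with every predictor
full = glm(type~., family = "binomial", data = data)

summary(full)
anova(full, test = "Chisq")

# Intercept-only model used as the starting point
nothing = glm(type~1, data = data, family = "binomial")

# First pass: forward selection from the intercept-only model
full.forward = step(nothing,
                    scope = list(lower = formula(nothing),
                                 upper = formula(full)),
                    direction = "forward")

# Second pass: start from the selected model and use its formula as the
# lower scope, so the terms already chosen are forced to stay in
forward = step(full.forward,
               scope = list(lower = formula(full.forward),
                            upper = formula(full)),
               direction = "forward")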
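
One caveat with the second call above: step() stops as soon as no candidate term lowers the AIC, so restarting from full.forward with the default settings may add nothing. The following is a minimal sketch, not a definitive recipe, of one way to keep going one term at a time until the full model is reached. It assumes that "next best" means the term giving the largest drop in deviance, which is what setting k = 0 (no AIC penalty) amounts to, and it uses steps = 1 to add a single term per call. The names pima, current, path and n_terms are illustrative only.

```r
library(MASS)

# Same data preparation as in the answer above, under a different name
pima <- rbind(Pima.te, Pima.tr)
pima$type <- ifelse(pima$type == "Yes", 1, 0)

full    <- glm(type ~ ., family = binomial, data = pima)
nothing <- glm(type ~ 1, family = binomial, data = pima)

# Helper: number of terms in a fitted model
n_in <- function(m) length(attr(terms(m), "term.labels"))

# Ordinary AIC-based forward selection; its terms are forced in from here on
current <- step(nothing,
                scope = list(lower = formula(nothing), upper = formula(full)),
                direction = "forward", trace = 0)

# Add exactly one term per iteration until the full model is reached.
# k = 0 removes the AIC penalty, so candidates are compared on deviance only
# and a term is added even when the penalised AIC would not improve.
path <- list(current)
while (n_in(current) < n_in(full)) {
  nxt <- step(current,
              scope = list(lower = formula(current), upper = formula(full)),
              direction = "forward", steps = 1, k = 0, trace = 0)
  if (n_in(nxt) == n_in(current)) break  # guard: nothing was added
  current <- nxt
  path[[length(path) + 1]] <- current
}

# Number of terms in each model along the path
n_terms <- sapply(path, n_in)
n_terms
```

Each element of path is a fitted glm, so formula(path[[i]]) shows which term entered at that stage. Applied to the original problem, the same loop would start from the 8-variable model and produce the 9- to 16-variable models in turn.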
