R purrr::map/broom::tidy Does Not Print Probability Values for GLM Model
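When running a log-linear GLM in R, I ran into a problem with purrr::map and broom::tidy. For some reason the model p-values are not printed when I run many models, but they are printed when I run a single model. Ultimately, I want the p-values printed for each of the multiple models, just as in the single-model case. The example below uses the built-in "Titanic" dataset (see William King's website).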
data(Titanic)
#convert to data frame
T.df <- as.data.frame(Titanic)
head(T.df)
#run glm as loglinear model
model1 <- glm(Freq ~ Sex * Survived, family = poisson, data = T.df)
#print model with tidy--p-values print here
broom::tidy(anova(model1, test = "Chisq"))
#Now run multiple models by class
#Note the models print just fine but without p values
T.df %>%
  tidyr::nest(-Class) %>%
  dplyr::mutate(
    fit = purrr::map(data, ~ anova(glm(Freq ~ Sex * Survived, family = poisson, data = .x)), test="Chisq"),
    tidied = purrr::map(fit, broom::tidy)
  ) %>%
  tidyr::unnest(tidied)
While I'm at it, how can I stop broom::tidy from printing warning messages about unrecognized columns?
Thanks in advance.
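The problem is a misplaced closing parenthesis in the anova call: test = "Chisq" ends up outside anova, so it is passed to purrr::map instead and anova runs without the test argument, i.e.
anova(glm(Freq ~ Sex * Survived, family = poisson, data = .x)), test="Chisq")
                                                            ^^^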
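To see why the misplaced parenthesis raises no error, here is a minimal sketch that uses round() as a stand-in for anova (the names res_outside and res_inside are only illustrative). It assumes the usual purrr behavior in which extra arguments to map are forwarded to the lambda's `...`; a formula lambda that never refers to `...` simply drops them.

```r
library(purrr)

# Argument placed OUTSIDE round(): it goes to map(), which forwards it to the
# lambda's `...`. The lambda body never uses `...`, so digits = 2 is ignored.
res_outside <- map(1:3, ~ round(.x / 3), digits = 2)

# Argument placed INSIDE round(): it reaches the function it was meant for.
res_inside <- map(1:3, ~ round(.x / 3, digits = 2))

str(res_outside)
str(res_inside)
```

The same thing happens with test = "Chisq" in the question: it never reaches anova, and nothing signals that it was dropped.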
With the closing parenthesis in the correct place:
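library(dplyr)  # mutate(), select(), %>%
library(tidyr)  # nest(), unnest()
library(purrr)  # map()

T.df %>%
  nest(data = -Class) %>%  # one nested data frame per Class
  mutate(tidied = map(data, ~
    glm(Freq ~ Sex * Survived, family = poisson, data = .x) %>%
      anova(test = "Chisq") %>%  # test = "Chisq" is now an argument of anova()
      broom::tidy())) %>%
  select(-data) %>%  # drop the nested data so only the tidied columns remain
  unnest(tidied)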
# A tibble: 16 x 7
# Class term df Deviance Resid..Df Resid..Dev p.value
# <fct> <chr> <int> <dbl> <int> <dbl> <dbl>
# 1 1st NULL NA NA 7 590. NA
# 2 1st Sex 1 3.78 6 586. 5.20e- 2
# 3 1st Survived 1 20.4 5 566. 6.28e- 6
# 4 1st Sex:Survived 1 162. 4 404. 4.78e- 37
# 5 2nd NULL NA NA 7 476. NA
# 6 2nd Sex 1 18.9 6 457. 1.37e- 5
# 7 2nd Survived 1 8.47 5 449. 3.62e- 3
# 8 2nd Sex:Survived 1 163. 4 286. 2.54e- 37
# 9 3rd NULL NA NA 7 876. NA
#10 3rd Sex 1 145. 6 732. 2.54e- 33
#11 3rd Survived 1 181. 5 550. 2.36e- 41
#12 3rd Sex:Survived 1 57.8 4 493. 2.92e- 14
#13 Crew NULL NA NA 7 2535. NA
#14 Crew Sex 1 1014. 6 1522. 2.02e-222
#15 Crew Survived 1 252. 5 1269. 7.85e- 57
#16 Crew Sex:Survived 1 42.4 4 1227. 7.63e- 11
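As for the second question, about the broom::tidy warnings on unrecognized columns: one possible approach, sketched here rather than taken from the original answer, is to wrap the tidy call in base R's suppressWarnings(). The helper name tidy_anova_quietly is hypothetical, and the block rebuilds T.df so that it runs on its own.

```r
library(dplyr)
library(tidyr)
library(purrr)

T.df <- as.data.frame(Titanic)

# Hypothetical helper (not part of the original post): fit the log-linear
# model for one subset and tidy its analysis-of-deviance table quietly.
tidy_anova_quietly <- function(d) {
  fit <- glm(Freq ~ Sex * Survived, family = poisson, data = d)
  # Only the tidy() step is wrapped, so warnings raised by glm() still show.
  suppressWarnings(broom::tidy(anova(fit, test = "Chisq")))
}

quiet <- T.df %>%
  nest(data = -Class) %>%
  mutate(tidied = map(data, tidy_anova_quietly)) %>%
  select(-data) %>%
  unnest(tidied)

quiet
```

Note that suppressWarnings() hides every warning raised inside the wrapped expression, not just the one about column names, so it is worth keeping the wrapped expression as small as possible.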