Plotting a Binomial GLM using ggplot
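I have created the following binomial GLM and would like to plot it using ggplot. The problem I'm having is how to map multiple variables to the x and y axes.
Here age + education + wantsMore are used to predict cbind(notUsing, using). Since there are several variables, how do I pass them as arguments to aes() in ggplot?
The code below should make this clearer.
The model: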
my_model = glm(cbind(notUsing, using) ~ age + education + wantsMore,
data = contraceptive2,
family = binomial(link = "logit"))
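Plotting the model (hopefully this makes it easier to see where I'm going wrong; I need to replace x and y with my data, but since they involve multiple variables I don't know how to do this):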
#make predictions
my_model_preds = predict(my_model, contraceptive2, se.fit = TRUE, type = 'response')
#ggplot of the model
ggplot(contraceptive2, aes(x, y)) +
geom_point() +
geom_line(aes(x, my_model_preds), col = 'blue')
Some of the data (if needed):
head(contraceptive2)
     age education wantsMore notUsing using
1    <25       low       yes       53     6
2    <25       low        no       10     4
3    <25      high       yes      200    52
4    <25      high        no       50    10
5  25-29       low       yes       60    14
6  25-29       low        no       19    10
7  25-29      high       yes      155    54
8  25-29      high        no       65    27
9  30-39       low       yes      112    33
10 30-39       low        no       77    80
11 30-39      high       yes      118    46
12 30-39      high        no       68    78
13 40-49       low       yes       35     6
14 40-49       low        no       46    48
15 40-49      high       yes        8     8
16 40-49      high        no       12    31
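For readers who want to run the code in this thread, the table can be typed in as a data frame. This is only a sketch: it assumes the 16 rows printed above are the complete `contraceptive2` data set (the question only says "some of the data"), and the `observed` column is a helper added here for illustration, not something from the original post.

```r
# Assumption: the 16 rows printed in the question are the whole data set.
contraceptive2 <- data.frame(
  age = factor(rep(c("<25", "25-29", "30-39", "40-49"), each = 4),
               levels = c("<25", "25-29", "30-39", "40-49")),
  education = rep(rep(c("low", "high"), each = 2), times = 4),
  wantsMore = rep(c("yes", "no"), times = 8),
  notUsing = c(53, 10, 200, 50, 60, 19, 155, 65,
               112, 77, 118, 68, 35, 46, 8, 12),
  using = c(6, 4, 52, 10, 14, 10, 54, 27,
            33, 80, 46, 78, 6, 48, 8, 31)
)

# Same model as in the question. The first column of cbind() is treated as
# the "success" count, so the model describes the probability of NOT using.
my_model <- glm(cbind(notUsing, using) ~ age + education + wantsMore,
                data = contraceptive2,
                family = binomial(link = "logit"))

# Hypothetical helper column: observed proportion not using in each cell.
contraceptive2$observed <- with(contraceptive2, notUsing / (notUsing + using))
```

With the data in this shape, `age` is a natural candidate for the x axis and the proportion for the y axis, while `education` and `wantsMore` can be mapped to facets and colour rather than to x or y.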
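From the data provided, we can see that two of your independent variables (education and wantsMore) are binary, so the whole model can be plotted using error bars, with facets for one variable and colour for the other: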
library(ggplot2)

# Build a grid with every combination of the predictor levels
df <- with(contraceptive2,
           expand.grid(age = unique(age), education = unique(education),
                       wantsMore = unique(wantsMore)))

# predict() on a glm returns the link scale (log-odds) by default
fits <- predict(my_model, newdata = df, se.fit = TRUE)

# Get odds from model
df$prediction <- exp(fits$fit)
df$upper <- exp(fits$fit + 1.96 * fits$se.fit)
df$lower <- exp(fits$fit - 1.96 * fits$se.fit)

# Convert odds to probabilities
df$prediction <- df$prediction / (1 + df$prediction)
df$upper <- df$upper / (1 + df$upper)
df$lower <- df$lower / (1 + df$lower)

# Plot probabilities
# (linewidth needs ggplot2 >= 3.4.0; use size = 1 on older versions)
ggplot(df, aes(age, prediction)) +
  geom_errorbar(aes(ymin = lower, ymax = upper, colour = wantsMore),
                width = 0.25, linewidth = 1,
                position = position_dodge(width = 0.4)) +
  geom_point(aes(fill = wantsMore), shape = 21, size = 3,
             position = position_dodge(width = 0.4)) +
  facet_grid(~education) +
  theme_light(base_size = 16) +
  scale_y_continuous(name = "Probability of not using", limits = c(0, 1),
                     labels = scales::percent)
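The odds-then-probability conversion above is the inverse logit, so it can also be done in one step with base R's `plogis()`. The sketch below checks that the two routes agree; it re-enters the data from the question (assumed here to be the full 16-row table), and the names `manual`, `direct`, `lower` and `upper` are illustrative only.

```r
# Assumption: the 16 rows shown in the question are the whole data set.
contraceptive2 <- data.frame(
  age = factor(rep(c("<25", "25-29", "30-39", "40-49"), each = 4),
               levels = c("<25", "25-29", "30-39", "40-49")),
  education = rep(rep(c("low", "high"), each = 2), times = 4),
  wantsMore = rep(c("yes", "no"), times = 8),
  notUsing = c(53, 10, 200, 50, 60, 19, 155, 65,
               112, 77, 118, 68, 35, 46, 8, 12),
  using = c(6, 4, 52, 10, 14, 10, 54, 27,
            33, 80, 46, 78, 6, 48, 8, 31)
)

my_model <- glm(cbind(notUsing, using) ~ age + education + wantsMore,
                data = contraceptive2,
                family = binomial(link = "logit"))

# Predictions and standard errors on the link (log-odds) scale
fits <- predict(my_model, newdata = contraceptive2,
                se.fit = TRUE, type = "link")

# Route 1: log-odds -> odds -> probability, as in the answer
odds <- exp(fits$fit)
manual <- odds / (1 + odds)

# Route 2: inverse logit in a single call
direct <- plogis(fits$fit)

# Approximate 95% interval: built on the link scale, then back-transformed,
# which keeps both limits inside [0, 1]
z <- qnorm(0.975)
lower <- plogis(fits$fit - z * fits$se.fit)
upper <- plogis(fits$fit + z * fits$se.fit)

# Both routes should give the same probabilities
all.equal(manual, direct)
```

Building the interval on the link scale and transforming afterwards is the reason the answer does not simply call `predict(..., type = "response")` and add or subtract standard errors: that shortcut can push the limits below 0 or above 1.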