Difference between a direct glm.nb call and geom_smooth's glm.nb fit for a spline model
I am trying to fit a negative binomial GLM to my data under two different conditions.
First, some toy data:
value times variable
1 82.21236 0.0000000 B
2 130.69185 0.0000000 A
3 159.10491 1.3131313 B
4 136.94357 0.6060606 A
5 192.22455 3.1313131 B
6 149.96539 3.1313131 A
7 115.91152 4.5454545 B
8 95.26077 4.2424242 A
9 73.79734 6.2626263 B
10 71.43359 6.1616162 A
11 106.83029 7.4747475 B
12 134.01414 7.0707071 A
13 44.66716 8.6868687 B
14 57.47017 8.6868687 A
15 41.02301 9.8989899 B
16 42.47003 9.4949495 A
17 66.26286 0.0000000 B
18 122.70818 0.0000000 A
19 187.01966 1.6161616 B
20 199.92595 1.6161616 A
21 138.26999 2.9292929 B
22 94.63155 3.2323232 A
23 149.99105 4.5454545 B
24 121.49791 4.1414141 A
25 107.17931 5.6565657 B
26 91.04130 5.7575758 A
27 84.03087 7.7777778 B
28 62.17754 7.6767677 A
29 52.81123 8.9898990 B
30 72.61422 7.5757576 A
31 52.33281 10.0000000 B
32 39.60495 9.6969697 A
Libraries:
library(ggplot2)
library(MASS)
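My goal is to retrieve the fitted model for downstream analysis, not just to visualize it, so I first used glm.nb from the MASS package. It fails to fit the data and I don't know why, especially since the same approach succeeds in ggplot.
Here is the code I have used so far: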
# Fit the negative binomial GLM and predict values for the observed data
ans <- glm.nb(value ~ splines::bs(times, Boundary.knots = c(0, 10), knots = c(3),
                                  degree = 3, intercept = FALSE):variable,
              data = data)
data$glm_nb <- predict(ans)

# Plot the points, ggplot's fit of the same model (green), and my predictions (blue)
p <- ggplot(data, aes(x = times, y = value, group = variable)) +
  facet_grid(. ~ variable) + theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = "times", y = "Value") +
  geom_point(size = 2, alpha = 0.2) + theme_bw(base_size = 22) +
  stat_smooth(method = "glm.nb",
              formula = y ~ splines::bs(x, Boundary.knots = c(0, 10), knots = c(3),
                                        degree = 3, intercept = FALSE),
              color = "green", size = 0.3) +
  geom_line(aes(x = times, y = glm_nb), color = "blue")
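Both ggplot and the standalone model warn that my x values are not integers, but ggplot still fits the data successfully.
What is especially strange is that when I try plain glm, it works! (Same code, just replacing glm.nb with glm.)
I tried looking through the source code to see what geom_smooth actually does, but I couldn't find the exact line where it computes the model.
Any ideas?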
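The problem is simply that you are getting the wrong values from the predict call: by default it uses type = "link", whereas you are looking for type = "response". If you make this change, you get the same result as ggplot, which returns predictions on the response scale without being asked: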
data$glm_nb <- predict(ans, type = "response")
ggplot(data, aes(x = times, y = value, group = variable)) +
geom_point(size=2,alpha = 0.2) +
stat_smooth(method = "glm.nb",
formula = y ~ splines::bs(x, Boundary.knots = c(0,10),
knots = c(3), degree = 3,
intercept = FALSE),
color = "green", size = 0.3) +
geom_line(aes(x = times, y = glm_nb), color = "blue") +
facet_grid(.~variable) +
labs(x = "times", y = "Value") +
theme_bw(base_size = 22)
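To see the two scales side by side, here is a minimal sketch using the toy data from the question. It assumes glm.nb keeps its default log link and that the glm comparison was run with the default gaussian family. The object names fit_nb, fit_glm, and grid are made up for illustration.

```r
library(MASS)

# Toy data from the question, entered column-wise (rows 1 to 32).
data <- data.frame(
  value = c(82.21236, 130.69185, 159.10491, 136.94357, 192.22455, 149.96539,
            115.91152, 95.26077, 73.79734, 71.43359, 106.83029, 134.01414,
            44.66716, 57.47017, 41.02301, 42.47003, 66.26286, 122.70818,
            187.01966, 199.92595, 138.26999, 94.63155, 149.99105, 121.49791,
            107.17931, 91.04130, 84.03087, 62.17754, 52.81123, 72.61422,
            52.33281, 39.60495),
  times = c(0, 0, 1.3131313, 0.6060606, 3.1313131, 3.1313131, 4.5454545,
            4.2424242, 6.2626263, 6.1616162, 7.4747475, 7.0707071, 8.6868687,
            8.6868687, 9.8989899, 9.4949495, 0, 0, 1.6161616, 1.6161616,
            2.9292929, 3.2323232, 4.5454545, 4.1414141, 5.6565657, 5.7575758,
            7.7777778, 7.6767677, 8.9898990, 7.5757576, 10, 9.6969697),
  variable = rep(c("B", "A"), times = 16)
)

# Same model formula as in the question.
f <- value ~ splines::bs(times, Boundary.knots = c(0, 10), knots = c(3),
                         degree = 3, intercept = FALSE):variable

# The response is not integer-valued, so glm.nb emits warnings;
# they are suppressed here only to keep the output readable.
fit_nb <- suppressWarnings(glm.nb(f, data = data))

link_scale     <- predict(fit_nb)                     # default: type = "link"
response_scale <- predict(fit_nb, type = "response")  # scale of the data

# With a log link, exponentiating the link-scale predictions
# recovers the response-scale predictions.
print(all.equal(exp(link_scale), response_scale))

# glm defaults to a gaussian family with an identity link,
# so there the two prediction types coincide.
fit_glm <- glm(f, data = data)
print(all.equal(predict(fit_glm), predict(fit_glm, type = "response")))

# For downstream use, predict on a regular grid for each condition.
grid <- expand.grid(times = seq(0, 10, length.out = 50),
                    variable = c("A", "B"))
grid$fit <- predict(fit_nb, newdata = grid, type = "response")
head(grid)
```

With a log link the default predictions are on the log scale (roughly 3.7 to 5.3 for values between 40 and 200), so they do not line up with the plotted points. With glm's default identity link the two scales are the same, which would explain why swapping in glm appeared to work.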