Slope for linear models with different intercepts: how to standardize/normalize slopes to compare?
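I have code that fits linear models to data in R. My independent variable is AGE and my dependent variable is Costs. I am interested in whether costs increase with age. However, for some segments my intercept is 10, while for others it is 1000, so the increase in monetary units is not helpful: a slope of 1 unit per year may be large for an intercept of 10, whereas a slope of 1 monetary unit per year is negligible for an intercept of 1000. Can anyone help me standardize the slopes in R so that they can be compared after computing them with lm?
Example: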
library(ggplot2)

data.ex <- data.frame(Age = c(c(1:10), c(1:10)),
                      Costs = c(11,12,13,14,15,12,17,18,19,20, 1001,1002,1003,1004,999,1006,1007,1008,1009,1010),
                      Type = c(rep("A", 10), rep("B", 10)))
# one fitted regression line per Type, stacked in two panels
pt <- ggplot(data = data.ex, aes(x = Age, y = Costs)) +
  geom_smooth(method = "lm") +
  facet_wrap(facets = "Type", nrow = 2)
plot(pt)
# per-group fits: the Age coefficient is the slope in monetary units per year
print(with(data.ex[data.ex$Type == "A", ], lm(Costs ~ Age)))
print(with(data.ex[data.ex$Type == "B", ], lm(Costs ~ Age)))
You could try using different scales for the y-axis... but honestly I don't really understand your question.
Age = c(1:10)
Costs = c(11,12,13,14,15,12,17,18,19,20)
plot(Age, Costs, type="l", col="blue", lwd=2)
Costs = c(1001,1002,1003,1004,999,1006,1007,1008,1009,1010)
plot(Age, Costs, type="l", col="blue", lwd=2)
As others have pointed out, setting scales = 'free' in facet_wrap will make the two lines stand out more clearly in the plot.
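As a minimal sketch of that suggestion, the block below rebuilds the example data from the question and passes scales = "free" to facet_wrap so each panel gets its own axis range. The object name pt_free and the added geom_point() layer are illustrative choices, not part of the original code.

```r
library(ggplot2)

# Example data from the question
data.ex <- data.frame(
  Age   = rep(1:10, times = 2),
  Costs = c(11, 12, 13, 14, 15, 12, 17, 18, 19, 20,
            1001, 1002, 1003, 1004, 999, 1006, 1007, 1008, 1009, 1010),
  Type  = rep(c("A", "B"), each = 10)
)

# scales = "free" lets each facet choose its own axis limits, so the
# trend in group A is no longer flattened by group B's much larger values
pt_free <- ggplot(data.ex, aes(x = Age, y = Costs)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_wrap(facets = "Type", nrow = 2, scales = "free")
print(pt_free)
```

If only the y-axis should vary between panels, scales = "free_y" keeps the Age axis shared.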
Regarding your other question, your wording is a bit unclear, but it sounds like you are saying, "If baseline costs start at $10, then an increase of $1/year is substantial, whereas at a baseline cost of $1,000, $1/year isn't significant. How do I show that difference?"
One way is to normalize each group by its intercept:
library(dplyr)
# calculate intercepts for each group and extract them:
intercept.ex <- group_by(data.ex, Type) %>%
  do(data.frame(intercept = coef(lm(Costs ~ Age, data = .))[1]))
# normalize the values in each group against their intercepts
data.ex <- merge(data.ex, intercept.ex) %>%
  mutate(Costs = Costs / intercept)
# Age slope = 0.1002
print(with(data.ex[data.ex$Type == "A", ], lm(Costs ~ Age)))
# Age slope = 0.001037
print(with(data.ex[data.ex$Type == "B", ], lm(Costs ~ Age)))
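I should point out that both slopes are still statistically significant, since the relationship between age and cost is very clear. The relative effect sizes, however, differ greatly between them.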
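Because rescaling Costs by a constant rescales the fitted slope by the same constant, the normalized slopes can also be obtained by dividing each group's Age coefficient by its intercept, without overwriting the data. The sketch below uses base R only and starts again from the original, unnormalized data; relative_slope is a hypothetical helper name introduced here for illustration.

```r
# Original (unnormalized) example data from the question
data.ex <- data.frame(
  Age   = rep(1:10, times = 2),
  Costs = c(11, 12, 13, 14, 15, 12, 17, 18, 19, 20,
            1001, 1002, 1003, 1004, 999, 1006, 1007, 1008, 1009, 1010),
  Type  = rep(c("A", "B"), each = 10)
)

# Hypothetical helper: slope expressed as a fraction of the fitted
# intercept, i.e. the fitted cost at Age = 0
relative_slope <- function(d) {
  fit <- lm(Costs ~ Age, data = d)
  unname(coef(fit)["Age"] / coef(fit)["(Intercept)"])
}

# One relative slope per Type, returned as a named numeric vector
rel <- sapply(split(data.ex, data.ex$Type), relative_slope)
print(rel)
```

This should reproduce the values in the comments above (about 0.1002 for A and 0.001037 for B). Note that the intercept is the fitted cost at Age = 0, which lies just outside the observed ages 1 to 10.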