Looking for differences between linear regression lines in R
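I am trying to figure out how to compare linear regressions (lines) in order to check whether the slopes of these regressions differ significantly.
I have searched Google extensively but could not work it out. Any help would be appreciated.
Here is a minimal working example: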
# making a dataframe
x <- c(13.5, 2.8, 10.1, 5.8, 6.4, 12.5, 3.2, 8.9, 13.0)
y <- c(1.2, 3.2, 0.2, 1.9, 2.5, 0.6, 2.0, 0.4, 1.3)
z <- c("A","A","A","B","B","B","C","C","C")
df = data.frame(x, y, z)
df
#      x   y z
# 1 13.5 1.2 A
# 2  2.8 3.2 A
# 3 10.1 0.2 A
# 4  5.8 1.9 B
# 5  6.4 2.5 B
# 6 12.5 0.6 B
# 7  3.2 2.0 C
# 8  8.9 0.4 C
# 9 13.0 1.3 C
###############################################
# plotting the data
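# factor() is needed for the colour lookup: since R 4.0.0, data.frame()
# keeps z as a character vector, and unclass(z) would then index with NA
plot(log10(x) ~ log10(y),
     col = c("blue", "red", "green")[as.integer(factor(z))], data = df,
     pch = 20)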
###############################################
# subsetting the data for making the regression lines
df.A <- subset(df, z == "A")
df.B <- subset(df, z == "B")
df.C <- subset(df, z == "C")
###############################################
# making and drawing the regression line for A
res.A=lm(log10(df.A$x) ~log10(df.A$y))
res.A
# Call:
# lm(formula = log10(df.A$x) ~ log10(df.A$y))
#
# Coefficients:
# (Intercept) log10(df.A$y)
# 0.8458 -0.3862
# drawing the regression line A
abline(res.A, col= "blue", lty = 2)
###############################################
# making and drawing the regression line for B
res.B=lm(log10(df.B$x) ~log10(df.B$y))
res.B
# Call:
# lm(formula = log10(df.B$x) ~ log10(df.B$y))
#
# Coefficients:
# (Intercept) log10(df.B$y)
# 0.9688 -0.5271
#drawing the regression line B
abline(res.B, col= "red", lty = 2)
###############################################
# making and drawing the regression line for C
res.C=lm(log10(df.C$x) ~log10(df.C$y))
res.C
# Call:
# lm(formula = log10(df.C$x) ~ log10(df.C$y))
#
# Coefficients:
# (Intercept) log10(df.C$y)
# 0.8586 -0.4330
#drawing the regression line C
abline(res.C, col= "green", lty = 2)
Thanks!
Use anova
Compare the model with 1 slope and 3 intercepts against the model with 3 slopes and 3 intercepts:
fm1 <- lm(log10(y) ~ z + log10(x), df)
fm3 <- lm(log10(y) ~ z + log10(x)/z, df)
anova(fm1, fm3)
giving:
Analysis of Variance Table

Model 1: log10(y) ~ z + log10(x)
Model 2: log10(y) ~ z + log10(x)/z
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1      5 0.72132
2      3 0.64793  2  0.073387 0.1699 0.8513
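So the differences among the slopes are not significant.
Note that we have only a small amount of data, so the differences would have to be quite large for such a small data set to show them as significant.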
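To make the F test in the table less of a black box, here is a minimal sketch that recomputes the statistic and p-value by hand from the residual sums of squares and degrees of freedom printed above. The variable names (`rss_1`, `rss_3`, `df_1`, `df_3`, `f_stat`, `p_value`) are illustrative only. Because the inputs are the rounded table values, the results should agree with the table only up to rounding.

```r
# Values copied from the anova() table above
rss_1 <- 0.72132  # residual sum of squares, common-slope model (fm1)
rss_3 <- 0.64793  # residual sum of squares, separate-slopes model (fm3)
df_1  <- 5        # residual degrees of freedom, fm1
df_3  <- 3        # residual degrees of freedom, fm3

# F = (drop in RSS per extra parameter) / (residual variance of the larger model)
f_stat <- ((rss_1 - rss_3) / (df_1 - df_3)) / (rss_3 / df_3)

# Upper-tail probability of the F distribution with (2, 3) degrees of freedom
p_value <- pf(f_stat, df_1 - df_3, df_3, lower.tail = FALSE)

round(f_stat, 4)
round(p_value, 4)
```

The numerator has 2 degrees of freedom because the separate-slopes model adds two slope parameters to the common-slope model.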
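As a possible follow-up, the sketch below writes the same nested comparison with the more common `*` interaction syntax and then refits the separate-slopes model in a parameterisation whose coefficients are the per-group intercepts and slopes directly. The object names (`fm_common`, `fm_separate`, `fm_groups`, `slopes`) are made up for this illustration. The code follows the orientation of `fm1` and `fm3` (`log10(y)` regressed on `log10(x)`); the question's own fits regress `log10(x)` on `log10(y)`, so swap the two variables if that orientation is the one of interest.

```r
# Data from the question
x <- c(13.5, 2.8, 10.1, 5.8, 6.4, 12.5, 3.2, 8.9, 13.0)
y <- c(1.2, 3.2, 0.2, 1.9, 2.5, 0.6, 2.0, 0.4, 1.3)
z <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
df <- data.frame(x, y, z = factor(z))

# The two nested models, written with interaction syntax
fm_common   <- lm(log10(y) ~ z + log10(x), data = df)  # 3 intercepts, 1 slope
fm_separate <- lm(log10(y) ~ z * log10(x), data = df)  # 3 intercepts, 3 slopes
anova(fm_common, fm_separate)

# Reparameterised fit: dropping the global intercept and nesting the slope
# within z yields one intercept and one slope coefficient per group
fm_groups <- lm(log10(y) ~ z / log10(x) - 1, data = df)
coefs  <- coef(fm_groups)
slopes <- coefs[grepl(":", names(coefs), fixed = TRUE)]
slopes
```

`fm_separate` and `fm_groups` describe the same fitted lines; only the meaning of the coefficients changes. In `fm_separate` the interaction coefficients are slope differences relative to the reference level of `z`, so `summary(fm_separate)` shows each of those differences with its own t test.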