lmPerm::lmp(y~x*f,center=TRUE) vs lm(y~x*f): very different coefficients

While

lmp(y~x, center=TRUE,perm="Prob")
lm(y~x)

give similar results when x and y are quantitative variables,

lmp(y~x*f, center=TRUE,perm="Prob")
lm(y~x*f)

differ when f is a factor variable.

library(lmPerm)  # provides lmp()
## Test data
x <- 1:1000
set.seed(1000)
y1 <- x*2+runif(1000,-100,100)
y1 <- y1+min(y1)
y2 <- 0.75*y1 + abs(rnorm(1000,50,10))
datos <- data.frame(x =c(x,x),y=c(y1,y2),tipo=factor(c(rep("A",1000),rep("B",1000))))

Then, as expected,

coefficients(lmp(y~x,perm="Prob",data=datos,center=FALSE))
# [1] "Settings:  unique SS "
# (Intercept)           x 
#   -37.69542     1.74498 

coefficients(lm(y~x,data=datos))
# (Intercept)           x 
#   -37.69542     1.74498 

But

fit.lmp <- lmp(y~x*tipo,perm="Prob",data=datos,center=FALSE)
fit.lm  <- lm(y~x*tipo, data=datos)

coefficients(fit.lm)
# (Intercept)           x       tipoB     x:tipoB 
# -71.1696395   1.9933827  66.9484438  -0.4968049 

coefficients(fit.lmp)
# (Intercept)           x       tipo1     x:tipo1 
# -37.6954176   1.7449803 -33.4742219   0.2484024 

I understand the coefficients from lm():

coefficients(fit.lm)[1:2] # coefficients for Level A
# (Intercept)           x 
# -71.169640    1.993383 

coefficients(fit.lm)[1:2] + coefficients(fit.lm)[3:4] # coefficients for Level B
# (Intercept)           x 
#   -4.221196    1.496578 

which corresponds to

contrasts(datos$tipo)
#  B
#A 0
#B 1
#attributes(fit.lm$qr$qr)$contrasts
#$tipo
#[1] "contr.treatment"

but not those from lmp():

coefficients(fit.lmp)[1:2] + coefficients(fit.lmp)[3:4] # coefficients for Level A
# (Intercept)           x 
# -71.169640    1.993383 

coefficients(fit.lmp)[1:2] - coefficients(fit.lmp)[3:4] # coefficients for Level B
# (Intercept)           x 
#  -4.221196    1.496578 

Why?

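lmp is applying contr.sum (sum-to-zero contrasts) to the factor rather than contr.treatment, the default that lm uses for unordered factors. The two fits describe the same model; only the parameterization differs. You can get the same coefficients from lm with: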

lm(y~x*tipo, data=datos, contrasts = list(tipo = "contr.sum"))
#Coefficients:
#(Intercept)            x        tipo1      x:tipo1  
#   -37.6954       1.7450     -33.4742       0.2484
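
To see how the two parameterizations relate, here is a minimal sketch using base R only (lmPerm is not loaded, so it checks lm under both contrast settings, not lmp itself). It rebuilds the test data from the question; the names fit.trt, fit.sum, b.sum, b.trt, level.A and level.B are introduced here for illustration. With two levels, contr.sum codes A as +1 and B as -1, so the per-level coefficients are the base terms plus or minus the tipo1 terms.

```r
# Rebuild the test data from the question.
x <- 1:1000
set.seed(1000)
y1 <- x*2 + runif(1000, -100, 100)
y1 <- y1 + min(y1)
y2 <- 0.75*y1 + abs(rnorm(1000, 50, 10))
datos <- data.frame(x = c(x, x), y = c(y1, y2),
                    tipo = factor(c(rep("A", 1000), rep("B", 1000))))

# Same model, two codings of the factor.
fit.trt <- lm(y ~ x * tipo, data = datos)  # default: contr.treatment
fit.sum <- lm(y ~ x * tipo, data = datos,
              contrasts = list(tipo = "contr.sum"))

# Coding matrix for a two-level factor: first level +1, second level -1.
print(contr.sum(2))

# Recover per-level intercept and slope from the sum-to-zero fit.
b.sum   <- unname(coef(fit.sum))
level.A <- b.sum[1:2] + b.sum[3:4]  # A is coded +1
level.B <- b.sum[1:2] - b.sum[3:4]  # B is coded -1

# Per-level intercept and slope from the treatment-coded fit.
b.trt <- unname(coef(fit.trt))

# Both codings give the same per-level lines and the same fitted values.
print(all.equal(level.A, b.trt[1:2]))
print(all.equal(level.B, b.trt[1:2] + b.trt[3:4]))
print(all.equal(fitted(fit.sum), fitted(fit.trt)))
```

Under contr.sum the (Intercept) and x terms are the averages of the two levels' intercepts and slopes, which is why they match the coefficients of the pooled fit y~x in this balanced data set.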