GLM 之后的多重比较,包括交互项
Multiple comparisons after GLM including interaction terms
我正在分析参与者对某些问题的二元答案数据集。我正在使用 glm
函数来测试 Var * Base_con
如何影响 Dec
的结果。拟合后,我试图比较 "Var" 因素如何影响每个 "Base_con" 因素水平内的结果。在 this vignette 之后,我尝试了以下(失败的)方法,我相信它可以重现(如果它不起作用请告诉我):
# load example dataset with relevant columns
require(RCurl)
my_csv = getURL("https://docs.google.com/spreadsheets/d/1sBVW7QbnfumeRmY1uEDdiDNJE7QfmCXH0wMmV2lZSH4/pub?gid=0&single=true&output=csv")
eg_data = read.csv(textConnection(my_csv))
# set columns as factors because they are numerically coded
eg_data$Base_con = as.factor(eg_data$Base_con)
eg_data$Var = as.factor(eg_data$Var)
eg_data$Dec = as.factor(eg_data$Dec)
# GLM fit
m1 = glm(Dec ~ Var * Base_con, data = eg_data, family = "binomial")
# strategy for Tukey multiple comparisons
require(multcomp)
tmp = expand.grid(Base_con = unique(eg_data$Base_con), Var = unique(eg_data$Var))
X = model.matrix(~Base_con : Var, data = tmp)
mc = glht(m1, linfct = X)
最后一个命令的输出是:
Error in glht.matrix(m1, linfct = X) :
‘ncol(linfct)’ is not equal to ‘length(coef(model))’
实际上报错的两个元素的列数和长度是不一样的:
> ncol(X)
[1] 7
> length(coef(m1))
[1] 6
这就是我到目前为止所能取得的进展。有任何想法吗?感谢大家。
如果您的 objective 是比较 Base_con
和 Var
相互作用的效果,更直接的方法可能是为这些相互作用创建一个新术语(这里我已经将数据框名称更改为 d
):
d$BV <- interaction(d$Base_con, d$Var)
然后拟合模型并进行比较:
# GLM fit
m1 <- glm(Dec ~ -1 + BV, data = d, family = "binomial")
library(multcomp)
summary(glht(m1, linfct = mcp(BV = "Tukey")))
# Simultaneous Tests for General Linear Hypotheses
# Multiple Comparisons of Means: Tukey Contrasts
# Fit: glm(formula = Dec ~ -1 + BV, family = "binomial", data = d)
# Linear Hypotheses:
# Estimate Std. Error z value Pr(>|z|)
# 2.0 - 1.0 == 0 -1.7988 0.4632 -3.883 0.00133 **
# 3.0 - 1.0 == 0 -4.9702 0.4846 -10.255 < 0.001 ***
# 1.1 - 1.0 == 0 -1.6596 0.4523 -3.669 0.00308 **
# 2.1 - 1.0 == 0 -3.0593 0.4392 -6.965 < 0.001 ***
# 3.1 - 1.0 == 0 -5.3893 0.4759 -11.325 < 0.001 ***
# 3.0 - 2.0 == 0 -3.1714 0.3190 -9.941 < 0.001 ***
# 1.1 - 2.0 == 0 0.1392 0.2673 0.521 0.99498
# 2.1 - 2.0 == 0 -1.2605 0.2446 -5.154 < 0.001 ***
# 3.1 - 2.0 == 0 -3.5905 0.3055 -11.751 < 0.001 ***
# 1.1 - 3.0 == 0 3.3106 0.3029 10.930 < 0.001 ***
# 2.1 - 3.0 == 0 1.9109 0.2830 6.751 < 0.001 ***
# 3.1 - 3.0 == 0 -0.4191 0.3371 -1.243 0.80488
# 2.1 - 1.1 == 0 -1.3997 0.2231 -6.273 < 0.001 ***
# 3.1 - 1.1 == 0 -3.7297 0.2887 -12.920 < 0.001 ***
# 3.1 - 2.1 == 0 -2.3300 0.2678 -8.702 < 0.001 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# (Adjusted p values reported -- single-step method)
我正在分析参与者对某些问题的二元答案数据集。我正在使用 glm
函数来测试 Var * Base_con
如何影响 Dec
的结果。拟合后,我试图比较 "Var" 因素如何影响每个 "Base_con" 因素水平内的结果。在 this vignette 之后,我尝试了以下(失败的)方法,我相信它可以重现(如果它不起作用请告诉我):
# load example dataset with relevant columns
require(RCurl)
my_csv = getURL("https://docs.google.com/spreadsheets/d/1sBVW7QbnfumeRmY1uEDdiDNJE7QfmCXH0wMmV2lZSH4/pub?gid=0&single=true&output=csv")
eg_data = read.csv(textConnection(my_csv))
# set columns as factors because they are numerically coded
eg_data$Base_con = as.factor(eg_data$Base_con)
eg_data$Var = as.factor(eg_data$Var)
eg_data$Dec = as.factor(eg_data$Dec)
# GLM fit
m1 = glm(Dec ~ Var * Base_con, data = eg_data, family = "binomial")
# strategy for Tukey multiple comparisons
require(multcomp)
tmp = expand.grid(Base_con = unique(eg_data$Base_con), Var = unique(eg_data$Var))
X = model.matrix(~Base_con : Var, data = tmp)
mc = glht(m1, linfct = X)
最后一个命令的输出是:
Error in glht.matrix(m1, linfct = X) :
‘ncol(linfct)’ is not equal to ‘length(coef(model))’
实际上报错的两个元素的列数和长度是不一样的:
> ncol(X)
[1] 7
> length(coef(m1))
[1] 6
这就是我到目前为止所能取得的进展。有任何想法吗?感谢大家。
如果您的 objective 是比较 Base_con
和 Var
相互作用的效果,更直接的方法可能是为这些相互作用创建一个新术语(这里我已经将数据框名称更改为 d
):
d$BV <- interaction(d$Base_con, d$Var)
然后拟合模型并进行比较:
# GLM fit
m1 <- glm(Dec ~ -1 + BV, data = d, family = "binomial")
library(multcomp)
summary(glht(m1, linfct = mcp(BV = "Tukey")))
# Simultaneous Tests for General Linear Hypotheses
# Multiple Comparisons of Means: Tukey Contrasts
# Fit: glm(formula = Dec ~ -1 + BV, family = "binomial", data = d)
# Linear Hypotheses:
# Estimate Std. Error z value Pr(>|z|)
# 2.0 - 1.0 == 0 -1.7988 0.4632 -3.883 0.00133 **
# 3.0 - 1.0 == 0 -4.9702 0.4846 -10.255 < 0.001 ***
# 1.1 - 1.0 == 0 -1.6596 0.4523 -3.669 0.00308 **
# 2.1 - 1.0 == 0 -3.0593 0.4392 -6.965 < 0.001 ***
# 3.1 - 1.0 == 0 -5.3893 0.4759 -11.325 < 0.001 ***
# 3.0 - 2.0 == 0 -3.1714 0.3190 -9.941 < 0.001 ***
# 1.1 - 2.0 == 0 0.1392 0.2673 0.521 0.99498
# 2.1 - 2.0 == 0 -1.2605 0.2446 -5.154 < 0.001 ***
# 3.1 - 2.0 == 0 -3.5905 0.3055 -11.751 < 0.001 ***
# 1.1 - 3.0 == 0 3.3106 0.3029 10.930 < 0.001 ***
# 2.1 - 3.0 == 0 1.9109 0.2830 6.751 < 0.001 ***
# 3.1 - 3.0 == 0 -0.4191 0.3371 -1.243 0.80488
# 2.1 - 1.1 == 0 -1.3997 0.2231 -6.273 < 0.001 ***
# 3.1 - 1.1 == 0 -3.7297 0.2887 -12.920 < 0.001 ***
# 3.1 - 2.1 == 0 -2.3300 0.2678 -8.702 < 0.001 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# (Adjusted p values reported -- single-step method)