具有交互作用的线性回归的聚合系数 [r]
Aggregate coeffcients from linear regression with interactions [r]
我正在尝试编写一个函数,该函数将采用线性回归模型、设计矩阵 X 并将 return 系数的边际效应。
边际效应在回归的汇总输出中给出。但是,如果我们的模型中存在交互作用,我们必须汇总系数。
我在R里准备了一个小例子:
library(tidyverse)
# Desing Matrix for regression with interaction ---------------------------
X <- model.matrix(hp ~ factor(gear) + disp + mpg + wt + wt:factor(gear), data = mtcars)
y <- mtcars$hp
lm(y ~ X - 1, data = mtcars) %>%
summary()
# Model Interpretation ----------------------------------------------------
# Our reference category is 'Gear3'.
# Meaning that X(Intercept) is an intercept for Gear3
# If I want an intercept for Gear4 and Gear5 coefs Gear4 and Gear5 are deviations from refrence cathegory
# So Intercept for Gear4 is: X(Intercept) + Xfactor(gear)4 = 166.0265 + 30.2619 = 196.2884
# Same Idea hold for Gear5: X(Intercept) + Xfactor(gear)5 = 166.0265 -58.2724 = 107.7541
# Now if we want to interpret wt as a marginal affect, again we need to take into account our created interactions.
# Xwt is marginal effect of wt for the regrence cathegory (Gear3)
# If we want marginal Affect for (Gear4 cars) we need: Xwt + Xfactor(gear)4:wt = -6.4985 -9.8580 = -16.3565
# SAme Idea for gear5: Xwt + Xfactor(gear)5:wt = -6.4985 + 49.6390 = 43.1405
I would like to write a function that would take model object and desing matrix and would return a dataframe with our aggregated marginal effects:
万一我想要一个函数来 return 一个数据框:
# X(Intercept)gear3: 166.0265
# X(Intercept)gear4: 196.2884
# X(Intercept)gear5: 107.7541
#
# Xwtgear3:-6.4985
# Xwtgear4:-16.3565
# Xwtgear5:43.1405
对于用户而言,交互的边际效应已经 precalculated/aggregated。
到目前为止,我能够创建一些还不够概括的概念。
因此,我想询问如何使用 dplyr
或 datatable
编写函数的任何想法。
我不确定你是对输出感兴趣还是对自己做的方式感兴趣。
关于前者,您可以使用 {interactions} 包,它会给您想要的输出。关于后者,您可以查看 {interactions} 程序包如何在幕后计算此输出。
library(dplyr)
library(interactions)
mtcars2 <- mtcars %>% mutate(gear = factor(gear))
fit <- lm(hp ~ wt * gear + disp + mpg, data = mtcars2)
slopes <- sim_slopes(fit, pred = wt, modx = gear, centered = "none")
#> Warning: Johnson-Neyman intervals are not available for factor moderators.
slopes$ints
#> Value of gear Est. S.E. 2.5% 97.5% t val. p
#> 1 5 107.7540 103.74667 -106.36856 321.8766 1.038626 0.30932987
#> 2 4 196.2884 100.91523 -11.99042 404.5672 1.945082 0.06357154
#> 3 3 166.0265 71.71678 18.01029 314.0426 2.315029 0.02947996
slopes$slopes
#> Value of gear Est. S.E. 2.5% 97.5% t val. p
#> 1 5 43.140490 26.69021 -11.94539 98.22637 1.6163415 0.1190910
#> 2 4 -16.356572 20.22409 -58.09705 25.38390 -0.8087667 0.4265945
#> 3 3 -6.498535 15.37599 -38.23302 25.23595 -0.4226417 0.6763194
由 reprex package (v0.3.0)
于 2021-01-03 创建
如果您希望两个结果合而为一 data.frame
您可以轻松地将它们绑定在一起:
bind_rows(mutate(slopes$ints, term = "(Intercept)"),
mutate(slopes$slopes, term = "(Interaction)")) %>%
select(term, "Value of gear", Est., p)
#> term Value of gear Est. p
#> 1 (Intercept) 5 107.754033 0.30932987
#> 2 (Intercept) 4 196.288380 0.06357154
#> 3 (Intercept) 3 166.026451 0.02947996
#> 4 (Interaction) 5 43.140490 0.11909097
#> 5 (Interaction) 4 -16.356572 0.42659451
#> 6 (Interaction) 3 -6.498535 0.67631941
我正在尝试编写一个函数,该函数将采用线性回归模型、设计矩阵 X 并将 return 系数的边际效应。
边际效应在回归的汇总输出中给出。但是,如果我们的模型中存在交互作用,我们必须汇总系数。
我在R里准备了一个小例子:
library(tidyverse)
# Desing Matrix for regression with interaction ---------------------------
X <- model.matrix(hp ~ factor(gear) + disp + mpg + wt + wt:factor(gear), data = mtcars)
y <- mtcars$hp
lm(y ~ X - 1, data = mtcars) %>%
summary()
# Model Interpretation ----------------------------------------------------
# Our reference category is 'Gear3'.
# Meaning that X(Intercept) is an intercept for Gear3
# If I want an intercept for Gear4 and Gear5 coefs Gear4 and Gear5 are deviations from refrence cathegory
# So Intercept for Gear4 is: X(Intercept) + Xfactor(gear)4 = 166.0265 + 30.2619 = 196.2884
# Same Idea hold for Gear5: X(Intercept) + Xfactor(gear)5 = 166.0265 -58.2724 = 107.7541
# Now if we want to interpret wt as a marginal affect, again we need to take into account our created interactions.
# Xwt is marginal effect of wt for the regrence cathegory (Gear3)
# If we want marginal Affect for (Gear4 cars) we need: Xwt + Xfactor(gear)4:wt = -6.4985 -9.8580 = -16.3565
# SAme Idea for gear5: Xwt + Xfactor(gear)5:wt = -6.4985 + 49.6390 = 43.1405
I would like to write a function that would take model object and desing matrix and would return a dataframe with our aggregated marginal effects:
万一我想要一个函数来 return 一个数据框:
# X(Intercept)gear3: 166.0265
# X(Intercept)gear4: 196.2884
# X(Intercept)gear5: 107.7541
#
# Xwtgear3:-6.4985
# Xwtgear4:-16.3565
# Xwtgear5:43.1405
对于用户而言,交互的边际效应已经 precalculated/aggregated。 到目前为止,我能够创建一些还不够概括的概念。
因此,我想询问如何使用 dplyr
或 datatable
编写函数的任何想法。
我不确定你是对输出感兴趣还是对自己做的方式感兴趣。
关于前者,您可以使用 {interactions} 包,它会给您想要的输出。关于后者,您可以查看 {interactions} 程序包如何在幕后计算此输出。
library(dplyr)
library(interactions)
mtcars2 <- mtcars %>% mutate(gear = factor(gear))
fit <- lm(hp ~ wt * gear + disp + mpg, data = mtcars2)
slopes <- sim_slopes(fit, pred = wt, modx = gear, centered = "none")
#> Warning: Johnson-Neyman intervals are not available for factor moderators.
slopes$ints
#> Value of gear Est. S.E. 2.5% 97.5% t val. p
#> 1 5 107.7540 103.74667 -106.36856 321.8766 1.038626 0.30932987
#> 2 4 196.2884 100.91523 -11.99042 404.5672 1.945082 0.06357154
#> 3 3 166.0265 71.71678 18.01029 314.0426 2.315029 0.02947996
slopes$slopes
#> Value of gear Est. S.E. 2.5% 97.5% t val. p
#> 1 5 43.140490 26.69021 -11.94539 98.22637 1.6163415 0.1190910
#> 2 4 -16.356572 20.22409 -58.09705 25.38390 -0.8087667 0.4265945
#> 3 3 -6.498535 15.37599 -38.23302 25.23595 -0.4226417 0.6763194
由 reprex package (v0.3.0)
于 2021-01-03 创建如果您希望两个结果合而为一 data.frame
您可以轻松地将它们绑定在一起:
bind_rows(mutate(slopes$ints, term = "(Intercept)"),
mutate(slopes$slopes, term = "(Interaction)")) %>%
select(term, "Value of gear", Est., p)
#> term Value of gear Est. p
#> 1 (Intercept) 5 107.754033 0.30932987
#> 2 (Intercept) 4 196.288380 0.06357154
#> 3 (Intercept) 3 166.026451 0.02947996
#> 4 (Interaction) 5 43.140490 0.11909097
#> 5 (Interaction) 4 -16.356572 0.42659451
#> 6 (Interaction) 3 -6.498535 0.67631941