R prediction package VS Stata margins

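I am switching from Stata to R, and I get inconsistent results when I compute marginal predictions with prediction in R and with the Stata command margins while fixing a variable at a given value. Here is the example: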

library(dplyr)
library(prediction)

d <- data.frame(x1 = factor(c(1,1,1,2,2,2), levels = c(1, 2)),
            x2 = factor(c(1,2,3,1,2,3), levels = c(1, 2, 3)),
            x3 = factor(c(1,2,1,2,1,2), levels = c(1, 2)),
            y = c(3.1, 2.8, 2.5, 4.3, 4.0, 3.5))

m2 <- lm(y ~ x1 + x2 + x3, d)
summary(m2)

marg2a <- prediction(m2, at = list(x2 = "1"))
marg2b <- prediction(m2, at = list(x1 = "1"))

marg2a %>%
  select(x1, fitted) %>%
  group_by(x1) %>%
  summarise(error = mean(fitted))

marg2b %>%
  select(x2, fitted) %>%
  group_by(x2) %>%
  summarise(error = mean(fitted))

These are the results:

# A tibble: 2 x 2
      x1    error
   <fctr>    <dbl>
1      1 3.133333
2      2 4.266667


# A tibble: 3 x 2
      x2 error
  <fctr> <dbl>
1      1 3.125
2      2 2.825
3      3 2.425

Whereas if I try to replicate this with Stata's margins, this is the result:

regress y i.x1 i.x2 i.x3
margins i.x1, at(x2 == 1)
margins i.x2, at(x1 == 1)


------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |
          1  |      3.125   .0829157    37.69   0.017     2.071456    4.178544
          2  |      4.275   .0829157    51.56   0.012     3.221456    5.328544
------------------------------------------------------------------------------

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |
          1  |      3.125   .0829157    37.69   0.017     2.071456    4.178544
          2  |      2.825   .0829157    34.07   0.019     1.771456    3.878544
          3  |      2.425   .0829157    29.25   0.022     1.371456    3.478544
------------------------------------------------------------------------------

The margins for x2 are the same in R and Stata, but for x1 they differ, and I don't know why. I would really appreciate any help. Thanks,

P

Your Stata and R code are not equivalent. To replicate that Stata code, you would need:

> prediction(m2, at = list(x1 = c("1", "2"), x2 = "1"))
Average predictions for 6 observations:
 at(x1) at(x2) value
      1      1 3.125
      2      1 4.275
> prediction(m2, at = list(x2 = c("1", "2", "3"), x1 = "1"))
Average predictions for 6 observations:
 at(x2) at(x1) value
      1      1 3.125
      2      1 2.825
      3      1 2.425

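That is because when you say margins i.x1, you are asking for predictions on counterfactual versions of the dataset in which x1 is replaced with 1 and then with 2 in every row, with the additional constraint that x2 is held at 1 in both counterfactuals. The same thing happens in your second Stata example.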
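The counterfactual averaging described above can be sketched in base R, without the prediction package. This is only an illustration under the assumption that the data and model are exactly those in the question; avg_pred is a hypothetical helper name, not a function from any package.

```r
# Rebuild the data and model from the question (base R only).
d <- data.frame(
  x1 = factor(c(1, 1, 1, 2, 2, 2), levels = c(1, 2)),
  x2 = factor(c(1, 2, 3, 1, 2, 3), levels = c(1, 2, 3)),
  x3 = factor(c(1, 2, 1, 2, 1, 2), levels = c(1, 2)),
  y  = c(3.1, 2.8, 2.5, 4.3, 4.0, 3.5)
)
m2 <- lm(y ~ x1 + x2 + x3, data = d)

# Hypothetical helper: overwrite x1 and x2 in EVERY row, leave x3 as
# observed, then average the predictions over all rows.
avg_pred <- function(model, data, x1_value, x2_value) {
  cf <- data
  cf$x1 <- factor(x1_value, levels = levels(data$x1))
  cf$x2 <- factor(x2_value, levels = levels(data$x2))
  mean(predict(model, newdata = cf))
}

# One counterfactual dataset per level of x1, with x2 held at "1".
margin_x1 <- sapply(levels(d$x1), function(v) avg_pred(m2, d, v, "1"))
print(round(margin_x1, 3))  # 3.125 and 4.275, as in the Stata table
```

Each average is taken over all 6 rows, which is why the header of the prediction() output reads "Average predictions for 6 observations".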

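This comes from an ambiguity in Stata's margins command, or rather from the fact that two different syntactic expressions produce the same output. One is your code: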

. margins i.x1, at(x2 == 1)

Predictive margins                              Number of obs     =          6
Model VCE    : OLS

Expression   : Linear prediction, predict()
at           : x2              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |
          1  |      3.125   .0829156    37.69   0.017     2.071457    4.178543
          2  |      4.275   .0829156    51.56   0.012     3.221457    5.328543
------------------------------------------------------------------------------

The other is more explicit about what is actually happening above:

. margins, at(x1 = (1 2) x2 == 1)

Predictive margins                              Number of obs     =          6
Model VCE    : OLS

Expression   : Linear prediction, predict()

1._at        : x1              =           1
               x2              =           1

2._at        : x1              =           2
               x2              =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |      3.125   .0829156    37.69   0.017     2.071457    4.178543
          2  |      4.275   .0829156    51.56   0.012     3.221457    5.328543
------------------------------------------------------------------------------
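For contrast, here is a base R sketch of what the grouped summary in the question appears to compute. It assumes the same data and model as in the question and is only meant as an illustration. With x2 fixed at 1 but x1 left as observed, grouping by x1 averages over just the 3 rows in each group, and the share of x3 == 2 differs between those groups, which would explain the small gap from the Stata margins.

```r
# Rebuild the data and model from the question (base R only).
d <- data.frame(
  x1 = factor(c(1, 1, 1, 2, 2, 2), levels = c(1, 2)),
  x2 = factor(c(1, 2, 3, 1, 2, 3), levels = c(1, 2, 3)),
  x3 = factor(c(1, 2, 1, 2, 1, 2), levels = c(1, 2)),
  y  = c(3.1, 2.8, 2.5, 4.3, 4.0, 3.5)
)
m2 <- lm(y ~ x1 + x2 + x3, data = d)

# Hold x2 at "1" in every row, but leave x1 and x3 as observed.
cf <- d
cf$x2 <- factor("1", levels = levels(d$x2))
cf$fitted <- predict(m2, newdata = cf)

# Averaging within the OBSERVED x1 groups uses only 3 rows per group.
by_group <- tapply(cf$fitted, cf$x1, mean)
print(round(by_group, 6))  # 3.133333 and 4.266667, as in the question

# Share of x3 == "2" within each observed x1 group: 1/3 versus 2/3,
# whereas averaging over all 6 rows uses a share of 1/2.
x3_share <- tapply(d$x3 == "2", d$x1, mean)
print(x3_share)
```

In the x2 comparison the observed x2 groups each happen to contain one row of each x3 level, so the grouped averages coincide with the margins there.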