Margins after seemingly unrelated regression too slow in Stata

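I have a dataset with 3 million observations. I need to estimate an LPM with SUR and obtain marginal effects.

I used gsem ... vce(cluster x), followed by margins, ... force. But it takes a very long time to get the margins results (more than 2 hours). I do need standard errors for the CIs, so I can't use the nose option.

Is there any other way to speed this up?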

The exact code depends on which marginal effect you mean. You can compute partial effects with lincom, which is very likely faster than margins.

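For example, suppose we estimate this model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3 + \varepsilon$$

The partial effect of x1 on y is obtained by taking the partial derivative with respect to x1:

$$\frac{\partial y}{\partial x_1} = \beta_1 + \beta_4 x_2 + \beta_5 x_3$$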

Plugging in the sample means of x2 and x3 gives the effect of x1 on y at those means. To do this in Stata:

// Get data
webuse regress

// Run the regression
qui reg y c.x1##c.(x2 x3)

// Get the sample means of x2 and x3 
sum x2 if e(sample), meanonly
scalar m_x2 = r(mean)
sum x3 if e(sample), meanonly
scalar m_x3 = r(mean)

// Calculate partial effect
lincom x1 + m_x2 * c.x1#c.x2 + m_x3*c.x1#c.x3

Result:

. lincom x1 + m_x2 * c.x1#c.x2 + m_x3*c.x1#c.x3

 ( 1)  x1 - .2972973*c.x1#c.x2 + 3019.459*c.x1#c.x3 = 0

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   1.409372   1.005254     1.40   0.163    -.5778255    3.396569
------------------------------------------------------------------------------
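
Since the question needs confidence intervals, it may help that lincom leaves its results behind in r(), so the interval can be rebuilt without margins. A minimal sketch, assuming the same example data and model; the scalar names pe, se, tcrit, lb and ub are only illustrative:

```stata
// Same data and model as in the example
webuse regress, clear
qui reg y c.x1##c.(x2 x3)

// Sample means of x2 and x3
sum x2 if e(sample), meanonly
scalar m_x2 = r(mean)
sum x3 if e(sample), meanonly
scalar m_x3 = r(mean)

// Run lincom quietly and keep the point estimate and standard error
qui lincom x1 + m_x2 * c.x1#c.x2 + m_x3*c.x1#c.x3
scalar pe = r(estimate)
scalar se = r(se)

// 95% critical value from the t distribution with the residual df
scalar tcrit = invttail(r(df), 0.025)
scalar lb = pe - tcrit*se
scalar ub = pe + tcrit*se

di "Effect: " pe "   SE: " se "   95% CI: [" lb ", " ub "]"
```

The bounds should agree with the interval printed in the lincom table. After an estimator that reports z rather than t statistics, r(df) would be missing and invnormal(0.975) would be the likely replacement for the critical value.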

As you can see, the lincom result matches what margins gives:

. qui reg y c.x1##c.(x2 x3)

. margins, dydx(x1) atmeans

Conditional marginal effects                    Number of obs     =        148
Model VCE    : OLS

Expression   : Linear prediction, predict()
dy/dx w.r.t. : x1
at           : x1              =    3.014865 (mean)
               x2              =   -.2972973 (mean)
               x3              =    3019.459 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.409372   1.005254     1.40   0.163    -.5778256    3.396569
------------------------------------------------------------------------------
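
If what is needed is the average marginal effect rather than the effect at the means, the same lincom expression should still apply in this example. The derivative of y with respect to x1 is linear in x2 and x3, so averaging it over the sample gives the same number as evaluating it at the means. That holds for a linear model such as an LPM, but not for nonlinear models like logit. A quick check, as a sketch (the matrix names are illustrative):

```stata
webuse regress, clear
qui reg y c.x1##c.(x2 x3)

// Effect of x1 evaluated at the sample means of the covariates
qui margins, dydx(x1) atmeans
matrix b_atmeans = r(b)

// Average marginal effect: dy/dx per observation, then averaged
qui margins, dydx(x1)
matrix b_ame = r(b)

di "At means: " b_atmeans[1,1] "   AME: " b_ame[1,1]
```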

Here is a speed comparison showing that, in this case, lincom is about 14 times faster than margins with 3 million observations:

clear
webuse regress
expand 20271

gen lincom = .
gen margins = .
qui reg y c.x1##c.(x2 x3)

forval i = 1/50 {

    timer clear
    
    timer on 1
    sum x2 if e(sample), meanonly
    scalar m_x2 = r(mean)
    sum x3 if e(sample), meanonly
    scalar m_x3 = r(mean)
    lincom x1 + m_x2 * c.x1#c.x2 + m_x3*c.x1#c.x3
    timer off 1

    timer on 2
    margins, dydx(x1) atmeans
    timer off 2
    
    timer list
    replace lincom = r(t1) in `i'
    replace margins = r(t2) in `i'
}

ttest lincom == margins
di "On average, lincom is " %4.2f `=r(mu_2) / r(mu_1)' " times faster than margins with `=_N' observations"
// On average, lincom is 13.88 times faster than margins with 3000108 observations
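
The question involves several equations, so the coefficients have to be referenced by equation. A minimal sketch of that with sureg, under some loud assumptions: y2 is a made-up second outcome created only for illustration, and sureg stands in for the gsem ... vce(cluster x) call from the question. The same _b[eq:coef] references should also work after gsem, but that is not tested here.

```stata
webuse regress, clear

// Hypothetical second outcome, generated only for this illustration
set seed 12345
gen y2 = y + rnormal()

// Two-equation SUR with the same interaction terms in each equation
qui sureg (y x1 x2 x3 c.x1#c.x2 c.x1#c.x3) ///
          (y2 x1 x2 x3 c.x1#c.x2 c.x1#c.x3)

// Sample means of x2 and x3
sum x2 if e(sample), meanonly
scalar m_x2 = r(mean)
sum x3 if e(sample), meanonly
scalar m_x3 = r(mean)

// Partial effect of x1 at the means, one lincom call per equation
lincom _b[y2:x1] + m_x2*_b[y2:c.x1#c.x2] + m_x3*_b[y2:c.x1#c.x3]
lincom _b[y:x1] + m_x2*_b[y:c.x1#c.x2] + m_x3*_b[y:c.x1#c.x3]
```

Because both equations use the same regressors, the SUR point estimates coincide with equation-by-equation OLS, so the estimate for the y equation should match the single-equation example. The standard errors and the reference distribution (z instead of t) can differ.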