Mixed Linear Models

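I am trying to fit a mixed linear model to the data below. I want to predict Gambling from alcdep, with age and sex as covariates. I am trying to use statsmodels in Python, but I am not sure how to go about it.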

So far I have tried: md = smf.mixedlm("acldep ~ Gambling", data, groups=data["Gambling"])

But I keep getting errors, and I don't know how to specify covariates this way.

Here is the head of the data:

{'IID': {0: 'Yale_0001', 1: 'Yale_0004', 2: 'Yale_0006', 3: 'Yale_0007', 4: 'Yale_0008'}, 'SEX': {0: 2, 1: 1, 2: 2, 3: 1, 4: 1}, 'AGE': {0: 27, 1: 39, 2: 41, 3: 45, 4: 44}, 'alcdep': {0: 2, 1: 2, 2: 2, 3: 2, 4: 2}, 'Gambling': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1}, 'Zero': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0}, 'Yes': {0: 'Yes', 1: 'Yes', 2: 'Yes', 3: 'Yes', 4: 'Yes'}, 'PRS': {0: 0.053486584299999994, 1: 0.0304387435, 2: 0.00917773968, 3: 0.016352741100000002, 4: 7.433452840000001e-05}}

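I modified your data slightly, because the values you posted lead to a singular matrix.

You were almost there; you just left out a few things. So, with this data: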

data = {'IID': {0: 'Yale_0001', 1: 'Yale_0004', 2: 'Yale_0006', 3: 'Yale_0007', 4: 'Yale_0008'}, 'SEX': {0: 2, 1: 1, 2: 2, 3: 1, 4: 1}, 'AGE': {0: 27, 1: 39, 2: 41, 3: 45, 4: 44}, 'alcdep': {0: 2, 1: 2, 2: 2, 3: 1, 4: 1}, 'Gambling': {0: 1, 1: 1, 2: 2, 3: 1, 4: 2}, 'Zero': {0: 0, 1: 0, 2: 0, 3: 1, 4: 0}, 'Yes': {0: 'Yes', 1: 'Yes', 2: 'Yes', 3: 'Yes', 4: 'Yes'}, 'PRS': {0: 0.053486584299999994, 1: 0.0304387435, 2: 0.00917773968, 3: 0.016352741100000002, 4: 7.433452840000001e-05}}

you can do this:

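import pandas as pd
import statsmodels.formula.api as smf

# Build a DataFrame from the dict above so the formula can refer to column names.
df = pd.DataFrame(data)
# Fixed effect for Gambling, with a random intercept for each Gambling group.
md = smf.mixedlm("alcdep ~ Gambling", data=df, groups="Gambling").fit()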

md.summary()

which gives:

      Mixed Linear Model Regression Results
=======================================================
Model:              MixedLM Dependent Variable: alcdep 
No. Observations:   5       Method:             REML   
No. Groups:         2       Scale:              0.3889 
Min. group size:    2       Log-Likelihood:     -3.7360
Max. group size:    3       Converged:          Yes    
Mean group size:    2.5                                
-------------------------------------------------------
             Coef.  Std.Err.   z    P>|z| [0.025 0.975]
-------------------------------------------------------
Intercept     1.833    1.630  1.125 0.261 -1.362  5.028
Gambling     -0.167    1.050 -0.159 0.874 -2.224  1.891
Gambling Var  0.389                                    
=======================================================

To add a covariate such as SEX, include it in the formula (C() marks it as categorical):

md = smf.mixedlm("alcdep ~ Gambling+C(SEX)",groups="Gambling",data = df).fit()

md.summary()

which gives:

     Mixed Linear Model Regression Results
=======================================================
Model:              MixedLM Dependent Variable: alcdep 
No. Observations:   5       Method:             REML   
No. Groups:         2       Scale:              0.2857 
Min. group size:    2       Log-Likelihood:     -2.5581
Max. group size:    3       Converged:          Yes    
Mean group size:    2.5                                
-------------------------------------------------------
             Coef.  Std.Err.   z    P>|z| [0.025 0.975]
-------------------------------------------------------
Intercept     1.714    1.400  1.225 0.221 -1.029  4.458
C(SEX)[T.2]   0.714    0.495  1.443 0.149 -0.256  1.684
Gambling     -0.286    0.904 -0.316 0.752 -2.057  1.485
Gambling Var  0.286                                    
=======================================================
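
The question asked for both age and sex as covariates, so here is a minimal sketch of a formula that includes both. Five rows are too few to estimate that many terms reliably, so the sketch runs on simulated data; the `sim` frame, the `site` grouping column, and all effect sizes are assumptions made up for illustration and are not part of the original data. It also groups on a separate clustering column rather than on a predictor, which is the more usual setup for a random intercept.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data (hypothetical): 20 clusters ("site") with 10 subjects each.
rng = np.random.default_rng(0)
n_groups, n_per = 20, 10
n = n_groups * n_per
site = np.repeat(np.arange(n_groups), n_per)
site_effect = rng.normal(0.0, 1.0, n_groups)[site]  # one random intercept per site

sim = pd.DataFrame({
    "site": site,
    "SEX": rng.integers(1, 3, n),       # coded 1 or 2, as in the original data
    "AGE": rng.integers(20, 60, n),
    "Gambling": rng.integers(1, 3, n),  # coded 1 or 2
})
# Made-up outcome: fixed effects plus the site intercept plus noise.
sim["alcdep"] = (
    1.0 + 0.5 * sim["Gambling"] + 0.02 * sim["AGE"]
    + site_effect + rng.normal(0.0, 1.0, n)
)

# Covariates are added with "+"; C(SEX) treats SEX as categorical, AGE stays numeric.
result = smf.mixedlm(
    "alcdep ~ Gambling + AGE + C(SEX)", data=sim, groups="site"
).fit()
print(result.summary())
```

With the real data, the same formula would be passed along with `data=df`, and `groups` would name whichever column identifies the clusters in the study.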