Analysis by group and extracting value from a variable for printing

Question

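I am doing some by-group processing, running some regressions by school.

What I would like to do is customize my output a bit so that I can see which output belongs to which school. However, I can't seem to work foreach or forvalues in a way that makes this work. I have tried various iterations of foreach and forvalues, with only partial success.

What I am stuck on now is getting the value of schoolname to display on the di line.

You can find some dummy data and code below: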

clear

input str6 studyid str30 schoolname y1gpa hsgpa a b c
VR2330  "Hot Dog University"  3.88696869  3.128562923 212.2027076 198.6369561 201.8520712
VR2330  "Hot Dog University"  3.724999751 4.14927266  200.2249981 197.2148641 190.8007133
VR2330  "Hot Dog University"  2.862368864 2.739375205 177.8104087 178.3566674 200.670764
VR2330  "Hot Dog University"  2.944155173 3.449033253 246.0577836 217.0256571 201.3599989
VR2330  "Hot Dog University"  3.027040023 2.774194849 179.7717585 208.3190507 201.1944748
VR2330  "Hot Dog University"  2.841508367 3.687799575 197.6369809 195.8034033 199.1525982
VR2330  "Hot Dog University"  2.709707669 2.258620921 147.0523958 247.4690088 215.5400833
VR2331  "The Berger Institute"  3.212822292 2.185146375 198.8197157 225.2337787 210.4646972
VR2331  "The Berger Institute"  2.304060034 2.241674897 188.3421993 186.9284032 207.108407
VR2331  "The Berger Institute"  3.339541832 3.106312279 209.7122346 193.2738859 207.9925428
VR2331  "The Berger Institute"  2.499369421 3.664498982 221.7819609 176.6067578 193.6349191
VR2331  "The Berger Institute"  2.58976085  2.604897762 201.4597068 189.7504268 193.0684748
VR2331  "The Berger Institute"  3.077416948 3.084996384 238.0112743 193.6023413 200.5245392
VR2331  "The Berger Institute"  3.595215292 3.47498973  196.0401919 205.2955727 204.7250124
VR2331  "The Berger Institute"  3.24943739  2.771259619 191.88872 179.8274715 210.3563047
VY1444  "Kale University" 3.58066891  3.765540136 185.2309378 198.1122011 196.1956994
VY1444  "Kale University" 2.620232242 3.079285234 163.3202145 195.7290603 205.682183
VY1444  "Kale University" 3.022673799 2.9914787 185.7449451 210.568389  206.960721
VY1444  "Kale University" 2.792861825 2.16564107  180.1691308 211.4182189 188.3452234
VY1444  "Kale University" 2.779154097 3.293620836 219.2595568 200.1849757 210.6425208
VY1444  "Kale University" 4.186316759 3.456717239 228.7297482 194.2097571 205.7079995
VY1444  "Kale University" 4.379739444 2.859316959 213.5641419 199.1315086 208.4406278
VY1444  "Kale University" 1.966028458 2.54365722  220.7757803 195.4262537 228.8124132
VY1444  "Kale University" 2.008067935 2.795116509 199.3403281 200.4161464 188.9522367
VZ4189  "Rice"  3.258253963 3.619015176 181.1053119 222.2819107 210.8807028
VZ4189  "Rice"  3.47515332  2.66431201  195.6496183 174.7512574 200.9326979
VZ4189  "Rice"  3.397466557 3.701428367 176.8322852 170.4327733 197.481968
VZ4189  "Rice"  3.141235215 3.26033076  187.7110626 187.5184942 215.002884
VZ4189  "Rice"  2.532078344 3.642275074 160.3208923 183.584604  194.770921
VZ4189  "Rice"  3.568638147 3.388113378 204.7815867 240.7565031 215.1194944
VZ4189  "Rice"  2.189863527 3.047948811 234.8225538 234.0024598 207.1882718
VZ4189  "Rice"  3.095726852 2.661160872 204.4226312 203.9618803 204.3683427
VZ4189  "Rice"  3.616748385 2.879665788 193.8070183 214.8352585 199.9727215
end

encode studyid, generate(school_id)
encode schoolname, generate(school_name)

sort school_id
egen _school = group(school_id)

tab1 _school schoolname
su _school, meanonly

forvalues _school = 1/`r(max)' { 

 di _n _dup(5)
 di "(start of analysis for `_school' ) "  _dup(60) "-" 
 di "I would like to have the actual ``schoolname`` here"

 regress y1gpa   hsgpa   if school_name == `_school'
 estimates store _mr

 regress y1gpa   hsgpa a b c   if school_name == `_school'
 estimates store _mf

 lrtest _mf _mr
 ftest _mf _mr

 test a b c 

}

Note: this question has also been cross-posted on Statalist.

Answer 1

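There are some good answers on Statalist (especially the ones using levelsof), but to offer a solution that keeps your approach with a few minor adjustments:

First, you don't need to generate both school_name and _school, since they identify the same groups.

Second, you need to store r(max) in a local and use that in the forvalues loop (rather than `r(max)', which is not valid). (Or, as pointed out in the comments, use `=r(max)' to evaluate r(max) and insert the result directly.)

Third, you can use an extended macro function to get the value label of school_name and display it (see help extended_fcn).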
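As a small, self-contained illustration of the second and third points, here is a sketch on Stata's shipped auto dataset rather than the school data. The loop variable f and the local lbl are names chosen only for this example; the variable foreign and its value label come with the dataset.

```stata
// Minimal sketch on the shipped auto dataset (not the school data)
sysuse auto, clear

// foreign is a labelled 0/1 variable; meanonly still returns r(min) and r(max)
summarize foreign, meanonly

// the =exp form evaluates r(max) on the spot and substitutes the number
forvalues f = 0/`=r(max)' {
    // extended macro function: value label attached to foreign for this code
    local lbl : label (foreign) `f'
    display as text "foreign == `f' is labelled: " as result "`lbl'"
}
```

The same two ideas applied to the school data: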

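encode studyid, generate(school_id)
encode schoolname, generate(school_name)
sort school_id

// school_name is coded 1, 2, ..., so r(max) is the number of schools
su school_name, meanonly
local nschool = r(max)

forvalues s = 1/`nschool' { 

    // look up the value label (the school's name) for code `s'
    local sch : label school_name `s'
    // five blank lines before each school's block
    di _dup(5) _n
    di as result "(start of analysis for `sch' )"  _dup(60) "-" 

    // restricted model
    regress y1gpa   hsgpa   if school_name == `s'
    estimates store _mr

    // full model
    regress y1gpa   hsgpa a b c   if school_name == `s'
    estimates store _mf

    lrtest _mf _mr
    // ftest is not an official Stata command, so it is left commented out
    //ftest _mf _mr

    test a b c 

}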
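
For comparison, the levelsof approach mentioned above might look roughly like this. It is a sketch, not the code posted on Statalist: it assumes the dummy data from the question are in memory and loops over the string variable schoolname directly, so no encode step or value-label lookup is needed. The local names schools and sch are illustrative.

```stata
// Sketch: assumes the question's dummy data are already in memory
// collect the distinct school names into a local macro
levelsof schoolname, local(schools)

foreach sch of local schools {
    display _newline as result "(start of analysis for `sch')" _dup(60) "-"

    // restricted model for this school
    regress y1gpa hsgpa if schoolname == "`sch'"
    estimates store _mr

    // full model for this school
    regress y1gpa hsgpa a b c if schoolname == "`sch'"
    estimates store _mf

    lrtest _mf _mr
}
```

Because the loop runs over the names themselves, the header and the if condition use the same macro, at the cost of comparing strings instead of integers.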


foreach

stata

stata-macros