Stata: refer to categorical fields by their labels instead of their numbers

Question

I am trying to work with categorical variables more efficiently.

Suppose I have a categorical variable, phone, with the following values:

    phone |      Freq.
----------+-----------
 Landline |        223
   Mobile |     49,297
     Both |      1,308

I would like to run a command like this:

sum x if phone == Mobile

To do that, I currently have to run the following three commands:

// figure out what the label is called
. describe phone, full

              storage   display    value
variable name   type    format     label
-------------------------------------------
phone           byte    %15.0g     phone_label

// list the label so I can figure out which number goes with which category
. label list phone_label

phone_label:
           1 Landline
           2 Mobile
           3 Both

// run the command with the numeric category identifier
. sum x if phone == 2

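Now my code contains an obscure line, phone == 2, which will be unclear to other users unless they go through the same steps as above.

Is there a way to use the category identifier "Mobile" directly instead of the numeric identifier 2?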

Answer 1

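Is there a way to use the value label (e.g. Mobile) directly instead of the value itself? Not that I know of.

When you define and assign value labels to categorical data, the underlying data do not change. In that sense the value labels are just a visual aid for the programmer.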
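A minimal sketch of this point, using the auto dataset that ships with Stata rather than the asker's data (an assumption made purely for illustration). It shows the same variable with and without its labels; the final comparison works on the stored numbers:

```stata
* Illustration on the shipped auto dataset, not the asker's phone data
sysuse auto, clear

* The "value label" column reports which label is attached to foreign
describe foreign

* Same variable, first displayed with label text, then with stored numbers
tabulate foreign
tabulate foreign, nolabel

* Comparisons operate on the stored numeric values
count if foreign == 1
```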

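If you want exactly the behaviour described above, you could decode your variable into a new string variable, although this does not seem like the best approach: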

decode phone, gen(phone_str)
summ x if phone_str=="Mobile" // OK
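The same idea can be tried on the shipped auto dataset. This is only a sketch: origin_str is a hypothetical variable name chosen for the example, and price stands in for x. The string comparison has to match the label text exactly, including case, which is one reason this route can be fragile.

```stata
* Illustration on the shipped auto dataset, not the asker's phone data
sysuse auto, clear

* Create a string variable holding the label text of foreign
* (origin_str is a made-up name for this example)
decode foreign, generate(origin_str)

* Both conditions should select the same observations
summarize price if origin_str == "Foreign"
summarize price if foreign == 1
```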

Another approach is to adjust the workflow you use above so that the "obscure line phone==2" problem goes away. A more programmatic version might be:

label list `: value label phone'       // display label in one step
local mobile_value 2                   // save value of "Mobile"
summ x if phone==`mobile_value'        // clearly show you are cutting over mobiles

Answer 2

You can select observations using value labels.

. sysuse auto, clear
(1978 Automobile Data)

. count if foreign=="Foreign":origin
  22

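You need to know the name of the value label, here origin. You can look it up in various ways.

This is documented in [U] 13.11 in Stata 14 and in earlier versions (possibly under different chapter and section numbers). See also http://www.stata-journal.com/article.html?article=dm0009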
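One of those ways, sketched under the assumption that you would rather not type the label name by hand: the extended macro function `: value label` returns the name of the label attached to a variable, and the result can be fed into the "text":labelname syntax. The local names lbl and foreign_value are made up for this example.

```stata
* Illustration on the shipped auto dataset
sysuse auto, clear

* Look up the name of the value label attached to foreign
local lbl : value label foreign
display "`lbl'"

* Use the label text directly in the condition
count if foreign == "Foreign":`lbl'

* Or store the numeric value once and reuse it
local foreign_value = "Foreign":`lbl'
summarize price if foreign == `foreign_value'
```

Applied to the question's data, and assuming the label really is named phone_label as in the describe output, the command would read: sum x if phone == "Mobile":phone_label. As far as I know, text that matches no label evaluates to missing, so a misspelt label selects observations with missing values instead of raising an error.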
