Is there a simple way to extract the first N words from a local macro that is comma- or comma+space-separated in Stata?
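Given a local macro that contains a string separated by a comma (","), by a comma and a space (", "), or even by a space only (" "), is there a simple way to extract the first N levels (or words) of this local macro?
The string would look like "12, 123, 1321, 41", or "12,123,1321,41", or "12 123 1321 41".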
Basically, I would be happy with a version of the macro function word # of string that works more or less like word 1/N of string. (See "Macro functions for parsing", p. 12 in Macro definition and manipulation.)
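For more context, I am working with the output of levelsof, local() sep(), so I can choose whichever separator is easier to work with. I want to pass the resulting levels as arguments to the inlist() function. The following generally works, but inlist() accepts at most 250 arguments. This is why I would like to extract blocks of 250 words from the result of levelsof.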
sysuse auto, clear
levelsof mpg if trunk > 20, local(levels) sep(", ")
list if inlist(mpg, `levels')
"Solution" so far
I came up with a not-so-simple way to do it, but it does not look good, and I wonder whether there is a simple built-in way to do this.
sysuse auto, clear
levelsof mpg if trunk > 20, local(levels) sep(", ")
scalar number_of_words = 3
forvalues i = 1 (1) `=number_of_words' {
local word_i = `i'
local this_level : word `word_i' of `levels'
local list_of_levels = "`list_of_levels'`this_level'"
di as text "loop: `i'"
di as text "this level: `this_level'"
di as text "list of levels so far: `list_of_levels'"
}
di "`list_of_levels'"
// trim trailing comma
local trimmed_list_of_levels = substr( "`list_of_levels'" , 1 , strlen( "`list_of_levels'" )-1)
di "`trimmed_list_of_levels'"
list make mpg price trunk if inlist(mpg, `trimmed_list_of_levels')
Output
. sysuse auto, clear
(1978 Automobile Data)
.
. levelsof mpg if trunk > 20, local(levels) sep(", ")
12, 15, 17, 18
. scalar number_of_words = 3
. forvalues i = 1 (1) `=number_of_words' {
2. local word_i = `i'
3. local this_level : word `word_i' of `levels'
4. local list_of_levels = "`list_of_levels'`this_level'"
5.
. di as text "loop: `i'"
6. di as text "this level: `this_level'"
7. di as text "list of levels so far: `list_of_levels'"
8. }
loop: 1
this level: 12,
list of levels so far: 12,
loop: 2
this level: 15,
list of levels so far: 12,15,
loop: 3
this level: 17,
list of levels so far: 12,15,17,
.
. di "`list_of_levels'"
12,15,17,
.
. // trim trailing comma
. local trimmed_list_of_levels = substr( "`list_of_levels'" , 1 , strlen( "`list_of_levels'" )-1)
.
. di "`trimmed_list_of_levels'"
12,15,17
. list make mpg price trunk if inlist(mpg, `trimmed_list_of_levels')
+------------------------------------------+
| make mpg price trunk |
|------------------------------------------|
2. | AMC Pacer 17 4,749 11 |
5. | Buick Electra 15 7,827 20 |
23. | Dodge St. Regis 17 6,342 21 |
26. | Linc. Continental 12 11,497 22 |
27. | Linc. Mark V 12 13,594 18 |
|------------------------------------------|
31. | Merc. Marquis 15 6,165 23 |
53. | Audi 5000 17 9,690 15 |
74. | Volvo 260 17 11,995 14 |
+------------------------------------------+
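A somewhat shorter variant of the loop above can be sketched under the assumption that levelsof is called without sep(), so that the levels are separated by single spaces. The first N words are collected with word # of and the spaces are turned into commas in one step, which avoids trimming a trailing comma. The macro names N and first_n are illustrative only.

```stata
sysuse auto, clear
levelsof mpg if trunk > 20, local(levels)   // no sep(): levels are space-separated

local N = 3                                 // illustrative: number of words to keep
local first_n                               // start from an empty macro
forvalues i = 1/`N' {
    local w : word `i' of `levels'
    local first_n `first_n' `w'
}

// collapse repeated blanks, then replace each blank with a comma
local first_n : list retokenize first_n
local first_n : subinstr local first_n " " ",", all

display "`first_n'"
list make mpg price trunk if inlist(mpg, `first_n')
```

This is still a loop rather than a single built-in call, so it only tidies the approach; it does not replace it.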
Edits related to the comments.
Edit 01)
For example, the following does not work. It returns error 130, expression too long.
clear
set obs 1000
gen id = _n
gen x1 = rnormal()
sum *
levelsof id if x1>0, local(levels) sep(", ")
sum * if inlist(id, `levels')
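To see how far such a list is from the 250-argument limit mentioned earlier, the number of levels can be counted with the word count macro function. This is only a diagnostic sketch: the seed and the macro name n_levels are illustrative, and the count includes the trailing commas as part of each word, which does not change the number of words.

```stata
clear
set seed 2021                               // illustrative seed, for reproducibility only
set obs 1000
gen id = _n
gen x1 = rnormal()

quietly levelsof id if x1 > 0, local(levels) sep(", ")
local n_levels : word count `levels'        // each "123," counts as one word
display "number of levels: `n_levels'"
```

With roughly half of 1,000 standard normal draws above zero, the count should land well above 250.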
An example where this construction (levelsof + inlist) seems necessary:
clear
set obs 5000
gen id = round(_n/5)
gen x1 = rnormal()
sum *
levelsof id if x1>2, local(levels) sep(", ")
sum * if x1>2 // if threshold is small enough, there will be too many values for inlist()
sum * if inlist(id, `levels')
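Using your additional example as a basis, you could use egen's max() function to create a flag that equals 1 for every observation of an id if any observation of that id has an x1 value above the given threshold. For example: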
clear
set seed 2021
set obs 5000
gen id = round(_n/5)
gen x1 = rnormal()
sum *
levelsof id if x1>2, local(levels) sep(", ")
sum * if x1>2 // if threshold is small enough, there will be too many values for inlist()
sum * if inlist(id, `levels')
//This will do the same thing
gen over_threshold = x1>2
egen id_over_thresh = max(over_threshold), by(id)
sum * if id_over_thresh
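If the list of levels really has to live in a macro (for instance because it was built elsewhere), one possible workaround, sketched here and not part of the original answer, is to feed inlist() one chunk of levels at a time and accumulate the matches in an indicator variable. The names in_levels, chunk, chunk_size and args are hypothetical, and the chunk size of 200 is simply a value assumed to sit safely below the inlist() limit.

```stata
clear
set seed 2021
set obs 1000
gen id = _n
gen x1 = rnormal()

quietly levelsof id if x1 > 0, local(levels)   // space-separated levels
local n : word count `levels'
local chunk_size = 200                         // assumption: safely below the inlist() limit

gen byte in_levels = 0
local chunk
local k = 0
forvalues i = 1/`n' {
    local w : word `i' of `levels'
    local chunk `chunk' `w'
    local ++k
    // flush the chunk when it is full or when the last level is reached
    if `k' == `chunk_size' | `i' == `n' {
        local chunk : list retokenize chunk
        local args : subinstr local chunk " " ",", all
        quietly replace in_levels = 1 if inlist(id, `args')
        local chunk
        local k = 0
    }
}

sum * if in_levels
```

When the levels can be derived from the data in memory, the egen flag above is shorter and avoids the macro entirely.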