Bring R list into Stata as macro?

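I would like to run a lasso model in R from Stata and then bring the resulting character list (the names of the selected coefficients) back into Stata as a macro (e.g., a global).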

I am currently aware of two options:

  1. I use shell:

    I save a dta file from Stata and run an R script with:
    shell $Rloc --vanilla <"${LOC}/Lasso.R"
    

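    This works with the saved dta file and lets me run the lasso model I want, but it is not interactive, so I cannot bring the relevant character list (the names of the selected variables) back into Stata.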

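  2. I run R interactively from Stata using rcall. However, rcall does not let me load a large enough matrix, even at maximum Stata memory. My predictor matrix Z (the one the lasso selects from) is 1,000 x 100, but when I run the command: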

    rcall: X <- st.matrix(Z) 
    

    I get the error message:

    macro substitution results in line that is too long: The line resulting from substituting macros would be longer than allowed. The maximum allowed length is 645,216 characters, which is calculated on the basis of set maxvar.

Is there any way to run R interactively from Stata that allows large matrices, so that I can bring a character list from R back into Stata as a macro?

Thanks in advance.

Below I will try to consolidate the comments into one, hopefully useful, answer.

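Unfortunately, rcall does not seem to cope well with matrices as large as the one you need. I think it would be better to use the shell command to call R, run your script, and have it save the strings as variables in a dta file. This takes more work, but it is certainly programmable.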

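You can then read these variables into Stata and manipulate them easily with built-in functions. For example, you could save the strings in separate variables or in a single variable and use levelsof, as suggested by @Dimitriy.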

Consider the following toy example:

clear
set obs 5

input str50 string
"this is a string"
"A longer string is this"
"A string that is even longer is this one"
"How many strings do you have?"
end

levelsof string, local(newstr) 
`"A longer string is this"' `"A string that is even longer is this one"' `"How many strings do you have?"' `"this is a string"'

tokenize `"`newstr'"'

forvalues i = 1 / `: word count `newstr'' {
    display "``i''"
}

A longer string is this
A string that is even longer is this one
How many strings do you have?
this is a string
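The shell route from the question could be closed into a round trip along these lines. This is a minimal sketch rather than a tested pipeline. It assumes that Lasso.R reads a file called lasso_input.dta and writes the selected names to selected.dta as a single string variable called varname (for example with haven::write_dta or foreign::write.dta). Those file and variable names are made up for illustration; $Rloc and ${LOC} are the globals from the question.

```stata
* Sketch only: lasso_input.dta, selected.dta and -varname- are assumed names.

* 1. Hand the data to R
save "${LOC}/lasso_input.dta", replace

* 2. Run the R script non-interactively (same call as in the question)
shell $Rloc --vanilla <"${LOC}/Lasso.R"

* 3. Read the names R wrote and collect them in a macro
preserve
use "${LOC}/selected.dta", clear
levelsof varname, local(selected) clean
restore

* 4. Promote the local to a global, as asked in the question
global SELECTED `selected'
display "$SELECTED"
```

The clean option drops the compound quotes around each value, which is fine here because variable names contain no spaces. The local survives restore, so the original data are back in memory when the global is defined.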

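In my experience, programs like rcall and rsource are useful for simple tasks. For more complicated work, however, they can become a real hassle, in which case I personally just resort to the real thing, that is, using the other software directly.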

As @Dimitriy also pointed out, there are now several community-contributed commands for lasso that may meet your needs, so you do not have to fiddle with R:

search lasso

5 packages found (Stata Journal and STB listed first)
-----------------------------------------------------

elasticregress from http://fmwww.bc.edu/RePEc/bocode/e
    'ELASTICREGRESS': module to perform elastic net regression, lasso
    regression, ridge regression / elasticregress calculates an elastic
    net-regularized / regression: an estimator of a linear model in which
    larger / parameters are discouraged.  This estimator nests the LASSO / and

lars from http://fmwww.bc.edu/RePEc/bocode/l
    'LARS': module to perform least angle regression / Least Angle Regression
    is a model-building algorithm that / considers parsimony as well as
    prediction accuracy.  This / method is covered in detail by the paper
    Efron, Hastie, Johnstone / and Tibshirani (2004), published in The Annals

lassopack from http://fmwww.bc.edu/RePEc/bocode/l
    'LASSOPACK': module for lasso, square-root lasso, elastic net, ridge,
    adaptive lasso estimation and cross-validation / lassopack is a suite of
    programs for penalized regression / methods suitable for the
    high-dimensional setting where the / number of predictors p may be large

pdslasso from http://fmwww.bc.edu/RePEc/bocode/p
    'PDSLASSO': module for post-selection and post-regularization OLS or IV
    estimation and inference / pdslasso and ivlasso are routines for
    estimating structural / parameters in linear models with many controls
    and/or / instruments. The routines use methods for estimating sparse /

sivreg from http://fmwww.bc.edu/RePEc/bocode/s
    'SIVREG': module to perform adaptive Lasso with some invalid instruments /
    sivreg estimates a linear instrumental variables regression / where some
    of the instruments fail the exclusion restriction / and are thus invalid.
    The LARS algorithm (Efron et al., 2004) is / applied as long as the Hansen
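If one of these packages fits, the selected names may be available without leaving Stata. The following is a hedged sketch using rlasso from lassopack. It assumes that rlasso leaves the selected predictors in e(selected), as I recall from its help file (check help rlasso on your installation), and y and x1-x100 are placeholder variable names, not names from the question.

```stata
* Sketch only: y and x1-x100 are placeholders for your own variables.
ssc install lassopack

* Lasso with a data-driven penalty level
rlasso y x1-x100

* Assumed stored result: names of the selected predictors
global SELECTED `e(selected)'
display "$SELECTED"
```

Note that rlasso chooses its penalty differently from a cross-validated lasso in R, so the selected set need not match what your R script returns.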