lm() with data frames and a for loop

I have a data frame DFMyDataBase:

DATE          AUDCAD      AUDCHF       AUDJPY     AUDNZD      (...)
05/01/2017  0.965960    0.742230    85.315000   1.048500      (...)
08/01/2017  0.971760    0.746410    85.353000   1.048140      (...)
09/01/2017  0.975070    0.749300    85.307000   1.054290      (...)
10/01/2017  0.980720    0.754540    85.873000   1.054380      (...)
11/01/2017  0.983750    0.756540    85.861000   1.053650      (...)
12/01/2017  0.983320    0.756070    85.822000   1.051750      (...)
(...)   

and a data frame DFLM:

FirstSymbol     SecondSymbol    PValue     DickeyFullerCV
     AUDCAD           AUDCHF
     AUDCAD           AUDJPY
     AUDCAD           AUDNZD
     AUDCAD           AUDUSD
      (...)            (...)

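I want to build lm() models from the pairs of names stored in DFLM, which are column names in DFMyDataBase. The first column of DFLM is the dependent variable and the second column is the independent variable.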

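It sounds as though you want to build formulas from the pairs of names stored in DFLM, which represent column names in DFMyDataBase, and then use those formulas as the basis for an lm for each pair. I'm further guessing that you want the first column of DFLM to be the dependent variable and the second column to be the independent variable.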

If that is the case, you could do something like this:

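# Build one formula per row of DFLM: FirstSymbol ~ SecondSymbol
f_list <- apply(DFLM, 1, function(x) as.formula(paste(x[1], "~", x[2])))

# Fit one model per formula; building the call explicitly keeps the
# actual formula and data name visible in each model's printed Call
models <- lapply(f_list, function(x) {
  eval(call("lm", formula = x, data = quote(DFMyDataBase)))
})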

Now models is a list of lm objects, one for each row of DFLM, so you can do:

models[[1]]
#> 
#> Call:
#> lm(formula = AUDCAD ~ AUDCHF, data = DFMyDataBase)
#> 
#> Coefficients:
#> (Intercept)       AUDCHF  
#>     0.06269      1.21739

summary(models[[3]])
#> 
#> Call:
#> lm(formula = AUDCHF ~ AUDNZD, data = DFMyDataBase)
#> 
#> Residuals:
#>          1          2          3          4          5          6 
#> -0.0037091  0.0010088 -0.0052919 -0.0001864  0.0029046  0.0052740 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)  
#> (Intercept)  -0.8210     0.7342  -1.118    0.326  
#> AUDNZD        1.4944     0.6980   2.141    0.099 .
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.004446 on 4 degrees of freedom
#> Multiple R-squared:  0.534,  Adjusted R-squared:  0.4175 
#> F-statistic: 4.583 on 1 and 4 DF,  p-value: 0.09899

and so on.
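Since DFLM in the question has an empty PValue column, one possible next step is to fill it from the fitted models. The following is a minimal sketch, assuming PValue is meant to hold the p-value of the slope coefficient (that is my assumption, not something the question states). It uses a plain for loop with reformulate() instead of the apply/eval approach above, and it leaves DickeyFullerCV alone because the question does not say how it should be computed.

```r
# Sample data, same values as the data given at the end of this answer
DFMyDataBase <- data.frame(
  DATE   = c("05/01/2017", "08/01/2017", "09/01/2017",
             "10/01/2017", "11/01/2017", "12/01/2017"),
  AUDCAD = c(0.96596, 0.97176, 0.97507, 0.98072, 0.98375, 0.98332),
  AUDCHF = c(0.74223, 0.74641, 0.7493, 0.75454, 0.75654, 0.75607),
  AUDJPY = c(85.315, 85.353, 85.307, 85.873, 85.861, 85.822),
  AUDNZD = c(1.0485, 1.04814, 1.05429, 1.05438, 1.05365, 1.05175)
)
DFLM <- data.frame(
  FirstSymbol  = c("AUDCAD", "AUDCAD", "AUDCHF"),
  SecondSymbol = c("AUDCHF", "AUDJPY", "AUDNZD"),
  stringsAsFactors = FALSE
)

# Assumption: PValue should hold the p-value of the slope term
DFLM$PValue <- NA_real_

for (i in seq_len(nrow(DFLM))) {
  # Build e.g. AUDCAD ~ AUDCHF from the two names in row i
  f <- reformulate(DFLM$SecondSymbol[i], response = DFLM$FirstSymbol[i])
  fit <- lm(f, data = DFMyDataBase)
  # Row 1 of the coefficient table is the intercept, row 2 is the slope
  DFLM$PValue[i] <- coef(summary(fit))[2, "Pr(>|t|)"]
}

DFLM
```

With a single predictor, the slope p-value is the same number as the overall F-test p-value that summary() prints at the bottom of its output.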

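Note that in your example data frame, the fourth row contains the same column name twice; it is not clear how you intend to handle that case. I have altered the input slightly, as shown below.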
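One way to deal with such rows, sketched here under the assumption that they should simply be skipped, is to filter DFLM before fitting. The same filter can also drop pairs that name a column DFMyDataBase does not have. The DFLM below is a hypothetical input built for illustration, with one repeated name and one missing column; valid and pairs_ok are names I made up.

```r
# Subset of the sample data from this answer
DFMyDataBase <- data.frame(
  AUDCAD = c(0.96596, 0.97176, 0.97507, 0.98072, 0.98375, 0.98332),
  AUDCHF = c(0.74223, 0.74641, 0.7493, 0.75454, 0.75654, 0.75607),
  AUDJPY = c(85.315, 85.353, 85.307, 85.873, 85.861, 85.822)
)
# Hypothetical pairs: row 3 repeats a name, row 4 names an absent column
DFLM <- data.frame(
  FirstSymbol  = c("AUDCAD", "AUDCAD", "AUDCAD", "AUDCAD"),
  SecondSymbol = c("AUDCHF", "AUDJPY", "AUDCAD", "AUDUSD"),
  stringsAsFactors = FALSE
)

# Keep only rows with two distinct names that both exist as columns
valid <- DFLM$FirstSymbol != DFLM$SecondSymbol &
  DFLM$FirstSymbol %in% names(DFMyDataBase) &
  DFLM$SecondSymbol %in% names(DFMyDataBase)
pairs_ok <- DFLM[valid, ]

# Fit one model per remaining pair
models <- Map(
  function(y, x) lm(reformulate(x, response = y), data = DFMyDataBase),
  pairs_ok$FirstSymbol, pairs_ok$SecondSymbol
)
# Label each model with its pair so it can be looked up by name
names(models) <- paste(pairs_ok$FirstSymbol, pairs_ok$SecondSymbol, sep = "~")

length(models)
```

Whether skipping is the right behavior depends on what the pair table is for; the rows could just as well be kept with NA results.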


Data

DFMyDataBase <- structure(list(DATE = c("05/01/2017", "08/01/2017",
"09/01/2017", "10/01/2017", "11/01/2017", "12/01/2017"), AUDCAD = c(0.96596, 
0.97176, 0.97507, 0.98072, 0.98375, 0.98332), AUDCHF = c(0.74223, 
0.74641, 0.7493, 0.75454, 0.75654, 0.75607), AUDJPY = c(85.315, 
85.353, 85.307, 85.873, 85.861, 85.822), AUDNZD = c(1.0485, 1.04814, 
1.05429, 1.05438, 1.05365, 1.05175)), class = "data.frame", row.names = c(NA, 
-6L))

DFLM <- structure(list(FirstSymbol = c("AUDCAD", "AUDCAD", "AUDCHF"), 
    SecondSymbol = c("AUDCHF", "AUDJPY", "AUDNZD")), row.names = c(NA, 
3L), class = "data.frame")

DFMyDataBase
#>         DATE  AUDCAD  AUDCHF AUDJPY  AUDNZD
#> 1 05/01/2017 0.96596 0.74223 85.315 1.04850
#> 2 08/01/2017 0.97176 0.74641 85.353 1.04814
#> 3 09/01/2017 0.97507 0.74930 85.307 1.05429
#> 4 10/01/2017 0.98072 0.75454 85.873 1.05438
#> 5 11/01/2017 0.98375 0.75654 85.861 1.05365
#> 6 12/01/2017 0.98332 0.75607 85.822 1.05175

DFLM
#>   FirstSymbol SecondSymbol
#> 1      AUDCAD       AUDCHF
#> 2      AUDCAD       AUDJPY
#> 3      AUDCHF       AUDNZD