lm() with data frames and a for loop
I have a data frame DFMyDataBase:
DATE AUDCAD AUDCHF AUDJPY AUDNZD (...)
05/01/2017 0.965960 0.742230 85.315000 1.048500 (...)
08/01/2017 0.971760 0.746410 85.353000 1.048140 (...)
09/01/2017 0.975070 0.749300 85.307000 1.054290 (...)
10/01/2017 0.980720 0.754540 85.873000 1.054380 (...)
11/01/2017 0.983750 0.756540 85.861000 1.053650 (...)
12/01/2017 0.983320 0.756070 85.822000 1.051750 (...)
(...)
and a data frame DFLM:
FirstSymbol SecondSymbol PValue DickeyFullerCV
AUDCAD AUDCHF
AUDCAD AUDJPY
AUDCAD AUDNZD
AUDCAD AUDUSD
(...) (...)
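I want to fit lm() models from the pairs of names stored in DFLM, which refer to column names in DFMyDataBase. The first column of DFLM is the dependent variable and the second column is the independent variable.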
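It sounds as though you want to build formulas from the pairs of names stored in DFLM, which represent column names in DFMyDataBase, and then use those formulas as the basis for an lm for each pair. I am further guessing that you want the first column of DFLM to represent the dependent variable and the second column the independent variable.
If so, you can do something like this: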
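# Build one formula per row of DFLM: first column ~ second column
f_list <- apply(DFLM, 1, function(x) as.formula(paste(x[1], "~", x[2])))
# Fit each model; constructing the call explicitly keeps the actual formula in the printed Call
models <- lapply(f_list, function(x) {
  eval(call("lm", formula = x, data = quote(DFMyDataBase)))
})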
Now models is a list of lm objects, one for each row of DFLM, and you can do:
models[[1]]
#>
#> Call:
#> lm(formula = AUDCAD ~ AUDCHF, data = DFMyDataBase)
#>
#> Coefficients:
#> (Intercept) AUDCHF
#> 0.06269 1.21739
or
summary(models[[3]])
#>
#> Call:
#> lm(formula = AUDCHF ~ AUDNZD, data = DFMyDataBase)
#>
#> Residuals:
#> 1 2 3 4 5 6
#> -0.0037091 0.0010088 -0.0052919 -0.0001864 0.0029046 0.0052740
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.8210 0.7342 -1.118 0.326
#> AUDNZD 1.4944 0.6980 2.141 0.099 .
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.004446 on 4 degrees of freedom
#> Multiple R-squared: 0.534, Adjusted R-squared: 0.4175
#> F-statistic: 4.583 on 1 and 4 DF, p-value: 0.09899
and so on.
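If the goal is to fill in a column such as PValue in DFLM, the fitted models can be summarised in one pass. The following is only a sketch: it assumes that PValue means the p-value of the slope coefficient, which the question does not state, and the Slope column is a hypothetical addition for illustration. The data are a reduced copy of the example used in this answer.

```r
# Reduced copy of the example data from this answer
DFMyDataBase <- data.frame(
  AUDCAD = c(0.96596, 0.97176, 0.97507, 0.98072, 0.98375, 0.98332),
  AUDCHF = c(0.74223, 0.74641, 0.7493, 0.75454, 0.75654, 0.75607),
  AUDNZD = c(1.0485, 1.04814, 1.05429, 1.05438, 1.05365, 1.05175)
)
DFLM <- data.frame(
  FirstSymbol  = c("AUDCAD", "AUDCHF"),
  SecondSymbol = c("AUDCHF", "AUDNZD"),
  stringsAsFactors = FALSE
)

# Same fitting step as in the answer
f_list <- apply(DFLM, 1, function(x) as.formula(paste(x[1], "~", x[2])))
models <- lapply(f_list, function(x) {
  eval(call("lm", formula = x, data = quote(DFMyDataBase)))
})

# Row 2 of each coefficient table belongs to the single regressor.
# Slope is a hypothetical column; PValue is ASSUMED to mean the slope's p-value.
DFLM$Slope  <- sapply(models, function(m) coef(summary(m))[2, "Estimate"])
DFLM$PValue <- sapply(models, function(m) coef(summary(m))[2, "Pr(>|t|)"])
DFLM
```

coef(summary(m)) returns the coefficient matrix shown in the printed summary, so indexing it by row and column name avoids parsing printed output.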
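Note that in your example data frame, the fourth row contains the same column name twice; it is not clear how you intend to handle that case. I have changed the input slightly, as shown below.
Data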
DFMyDataBase <- structure(list(DATE = c("05/01/2017", "08/01/2017",
"09/01/2017", "10/01/2017", "11/01/2017", "12/01/2017"), AUDCAD = c(0.96596,
0.97176, 0.97507, 0.98072, 0.98375, 0.98332), AUDCHF = c(0.74223,
0.74641, 0.7493, 0.75454, 0.75654, 0.75607), AUDJPY = c(85.315,
85.353, 85.307, 85.873, 85.861, 85.822), AUDNZD = c(1.0485, 1.04814,
1.05429, 1.05438, 1.05365, 1.05175)), class = "data.frame", row.names = c(NA,
-6L))
DFLM <- structure(list(FirstSymbol = c("AUDCAD", "AUDCAD", "AUDCHF"),
SecondSymbol = c("AUDCHF", "AUDJPY", "AUDNZD")), row.names = c(NA,
3L), class = "data.frame")
DFMyDataBase
#> DATE AUDCAD AUDCHF AUDJPY AUDNZD
#> 1 05/01/2017 0.96596 0.74223 85.315 1.04850
#> 2 08/01/2017 0.97176 0.74641 85.353 1.04814
#> 3 09/01/2017 0.97507 0.74930 85.307 1.05429
#> 4 10/01/2017 0.98072 0.75454 85.873 1.05438
#> 5 11/01/2017 0.98375 0.75654 85.861 1.05365
#> 6 12/01/2017 0.98332 0.75607 85.822 1.05175
DFLM
#> FirstSymbol SecondSymbol
#> 1 AUDCAD AUDCHF
#> 2 AUDCAD AUDJPY
#> 3 AUDCHF AUDNZD
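Since the question title mentions a for loop, the same fitting step can also be written as an explicit loop. This is a minimal sketch rather than a replacement for the approach above: it uses reformulate() from base R to build each formula, and naming the list elements by pair is a hypothetical convenience, not something the question asks for.

```r
# Reduced copy of the example data from this answer
DFMyDataBase <- data.frame(
  AUDCAD = c(0.96596, 0.97176, 0.97507, 0.98072, 0.98375, 0.98332),
  AUDCHF = c(0.74223, 0.74641, 0.7493, 0.75454, 0.75654, 0.75607),
  AUDNZD = c(1.0485, 1.04814, 1.05429, 1.05438, 1.05365, 1.05175)
)
DFLM <- data.frame(
  FirstSymbol  = c("AUDCAD", "AUDCHF"),
  SecondSymbol = c("AUDCHF", "AUDNZD"),
  stringsAsFactors = FALSE
)

# Pre-allocate the result list, then fit one model per row of DFLM
models <- vector("list", nrow(DFLM))
for (i in seq_len(nrow(DFLM))) {
  # reformulate(termlabels, response) builds response ~ termlabels
  f <- reformulate(DFLM$SecondSymbol[i], response = DFLM$FirstSymbol[i])
  # The printed Call will show `formula = f` rather than the expanded formula
  models[[i]] <- lm(f, data = DFMyDataBase)
}

# Hypothetical naming scheme so models can be looked up by pair
names(models) <- paste(DFLM$FirstSymbol, DFLM$SecondSymbol, sep = "~")
coef(models[["AUDCAD~AUDCHF"]])
```

The trade-off is in the printed output: the loop is easier to read, while the eval(call(...)) form above records the actual formula in each model's Call.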