Rolling window to estimate covariance matrix
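I have a 4-year time series of asset returns, and I am trying to use a rolling window to estimate the variance-covariance matrix with a calibration period of 6 months. In total, I should obtain 40 covariance matrices.
I tried running the code below, but it is wrong.
How can I modify this R code?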
data
window.size <- 180 #set the size of the window equal to 6 months
windows <- embed(1:nrow(data), window.size)
forApproach <- function(data, windows) {
l <- vector(mode="list", length=nrow(windows))
for (i in 1:nrow(data)) {
l[[i]] <- cov(data[windows[i, ], ])
}
}
Consider as the dataset a matrix containing the returns of 5 assets over 20 days:
data <- matrix(rnorm(100), 20, 5) #data represents the returns of 5 assets over 20 days
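I want to calibrate the covariance matrix of the returns over 5 days, so using days 1, 2, 3, 4, 5. Then I want to calibrate another covariance matrix using days 6, 7, 8, 9, 10, and so on, using a rolling window (I have already tried a for loop).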
window.size <- 5
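However, with the window size set to 5, the code uses days 1, 2, 3, 4, 5 for the first matrix but days 2, 3, 4, 5, 6 for the second matrix (not the days 6, 7, 8, 9, 10 that I want). This is my problem. I don't know how to modify the code so that the "split" moves from day 2 to day 6.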
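I think there is a misunderstanding about the term "rolling window". Usually, a rolling-window approach computes some metric across the rows inside a window while "rolling along" row by row. So in your 5-day window case, where each row corresponds to one day, rows 1,2,3,4,5 are followed by rows 2,3,4,5,6, then rows 3,4,5,6,7, and so on.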
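To make the distinction concrete, here is a small sketch (not part of the original answer) that uses base R's `embed()`, the same function as in the question, to list the row indices of each overlapping window. The names `n`, `w` and `windows` are illustrative only.

```r
# Row indices of overlapping 5-row windows over 20 rows,
# advancing one row at a time.
n <- 20   # number of rows (days)
w <- 5    # window size
windows <- embed(seq_len(n), w)

# embed() stores each window in reverse order, so sort for display.
nrow(windows)        # number of windows: n - w + 1
sort(windows[1, ])   # rows of the first window
sort(windows[2, ])   # rows of the second window
```

Consecutive windows share four of their five rows, which is why the second window starts at day 2 rather than day 6. The reversed order inside each row of `windows` does not affect `cov()`.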
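If I understand you correctly, you instead want to calculate covariance matrices for non-overlapping blocks of rows.
Given the sample data, you could do this: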
# Sample data
set.seed(2017);
df <- matrix(rnorm(100), 20, 5)
# Split into groups of 5 corresponding to 5 days and calculate
# covariance matrix
idx <- rep(1:(nrow(df) / 5), each = 5)
lapply(split(as.data.frame(df), idx), cov)
#$`1`
# V1 V2 V3 V4 V5
#V1 1.42311854 1.12594509 -0.01635956 -0.02680876 -0.9996623
#V2 1.12594509 1.91104181 0.01600511 -0.50270431 -0.4910714
#V3 -0.01635956 0.01600511 0.21584984 0.04264861 0.5356313
#V4 -0.02680876 -0.50270431 0.04264861 0.80241761 -0.3501894
#V5 -0.99966230 -0.49107141 0.53563126 -0.35018940 2.2617564
#
#$`2`
# V1 V2 V3 V4 V5
#V1 1.6361650 0.28858744 0.55629684 -0.10309928 -0.56784302
#V2 0.2885874 0.32030225 0.09751046 -0.03968577 0.10521384
#V3 0.5562968 0.09751046 0.21460406 0.06921578 -0.20474838
#V4 -0.1030993 -0.03968577 0.06921578 0.44061198 -0.02624344
#V5 -0.5678430 0.10521384 -0.20474838 -0.02624344 0.35858727
#
#$`3`
# V1 V2 V3 V4 V5
#V1 1.32188749 -0.2504449 0.02865553 -0.83709045 0.7402660
#V2 -0.25044493 0.4449060 -0.45165482 0.18724720 -0.1684300
#V3 0.02865553 -0.4516548 1.59804827 -0.05257944 -0.2588460
#V4 -0.83709045 0.1872472 -0.05257944 2.08276888 0.1345800
#V5 0.74026604 -0.1684300 -0.25884602 0.13457998 0.7381084
#
#$`4`
# V1 V2 V3 V4 V5
#V1 1.3825793 1.8348434 0.1367480 0.7553666 0.1722815
#V2 1.8348434 3.0679884 -0.7141430 1.9419513 0.4139003
#V3 0.1367480 -0.7141430 1.3646673 -1.3689109 -0.3962832
#V4 0.7553666 1.9419513 -1.3689109 2.1242897 0.7087351
#V5 0.1722815 0.4139003 -0.3962832 0.7087351 0.4589429
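The split above assumes that the number of rows is an exact multiple of the block size. As a rough sketch for the case where it is not, the hypothetical helper `block_cov()` below (the name and its arguments are mine, not from the answer) keeps only the complete blocks and drops any leftover rows at the end. Dropping them is an assumption; you may prefer to handle the remainder differently.

```r
# Hypothetical helper: covariance matrices over consecutive,
# non-overlapping blocks of `size` rows.
block_cov <- function(x, size) {
  n_blocks <- nrow(x) %/% size        # number of complete blocks
  stopifnot(n_blocks >= 1)
  rows <- seq_len(n_blocks * size)    # leftover rows at the end are dropped
  grp <- ceiling(rows / size)         # block label for each kept row
  lapply(split(rows, grp), function(r) cov(x[r, , drop = FALSE]))
}

set.seed(2017)
x <- matrix(rnorm(110), 22, 5)   # 22 days, so 2 rows are left over
res <- block_cov(x, 5)
length(res)                      # number of complete blocks
```

Indexing the matrix by row numbers avoids the conversion to a data frame, so the result stays a plain list of matrices.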
Update
To address the situation from your comment, here is one possibility:
# Calculate rows by which to calculate the covariance matrix.
idx <- lapply(seq(1, nrow(df) - 5, by = 3), function(i) seq(i, i + 4));
idx;
#[[1]]
#[1] 1 2 3 4 5
#
#[[2]]
#[1] 4 5 6 7 8
#
#[[3]]
#[1] 7 8 9 10 11
#
#[[4]]
#[1] 10 11 12 13 14
#
#[[5]]
#[1] 13 14 15 16 17
# Calculate covariance matrix
lapply(idx, function(i) cov(df[i, ]))
#[[1]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1.42311854 1.12594509 -0.01635956 -0.02680876 -0.9996623
#[2,] 1.12594509 1.91104181 0.01600511 -0.50270431 -0.4910714
#[3,] -0.01635956 0.01600511 0.21584984 0.04264861 0.5356313
#[4,] -0.02680876 -0.50270431 0.04264861 0.80241761 -0.3501894
#[5,] -0.99966230 -0.49107141 0.53563126 -0.35018940 2.2617564
#
#[[2]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1.2276633 0.8120994 0.68757421 0.43389428 -0.2034626
#[2,] 0.8120994 0.9467878 0.54971586 0.32442138 0.0417013
#[3,] 0.6875742 0.5497159 0.81237637 0.04317779 0.1016797
#[4,] 0.4338943 0.3244214 0.04317779 0.28202885 -0.1328829
#[5,] -0.2034626 0.0417013 0.10167967 -0.13288293 0.1941425
#
#[[3]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1.611594316 0.20860309 0.57449605 -0.009977472 -0.5998735
#[2,] 0.208603088 0.37400181 0.02228603 -0.184461638 0.1758137
#[3,] 0.574496047 0.02228603 0.25869591 0.192013428 -0.2558926
#[4,] -0.009977472 -0.18446164 0.19201343 0.726477219 -0.1542378
#[5,] -0.599873476 0.17581368 -0.25589263 -0.154237772 0.4141949
#
#[[4]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 0.959758045 -0.002570837 -0.2490718 -0.11965574 0.7669619
#[2,] -0.002570837 0.413593056 -0.2238722 -0.05783551 -0.1231235
#[3,] -0.249071754 -0.223872167 1.3953139 0.56463838 -0.2210563
#[4,] -0.119655741 -0.057835506 0.5646384 1.09879770 0.1947360
#[5,] 0.766961857 -0.123123489 -0.2210563 0.19473603 0.7653579
#
#[[5]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1.02217247 0.8925311 -0.01480308 0.4282321 0.5941764
#[2,] 0.89253109 2.8366577 -1.20242470 2.7991809 0.6818609
#[3,] -0.01480308 -1.2024247 1.48751111 -1.7348326 -0.1196483
#[4,] 0.42823208 2.7991809 -1.73483255 3.8382883 1.0043009
#[5,] 0.59417636 0.6818609 -0.11964826 1.0043009 0.8229246
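Both cases (non-overlapping blocks and partially overlapping windows) differ only in how far the window advances each time. As a minimal sketch, the hypothetical function `rolling_cov()` below (its name and the parameters `window` and `step` are my own, not from the answer) wraps the same idea. One deliberate difference: it uses `nrow(x) - window + 1` as the upper bound for the start rows, so a final window that ends exactly on the last row (rows 16 to 20 here) is also included, whereas the bound `nrow(df) - 5` used above stops one start earlier.

```r
# Hypothetical helper: covariance matrices over windows of `window` rows,
# advancing `step` rows between consecutive windows.
rolling_cov <- function(x, window, step = 1) {
  stopifnot(window <= nrow(x), step >= 1)
  # Last admissible start row is nrow(x) - window + 1.
  starts <- seq(1, nrow(x) - window + 1, by = step)
  lapply(starts, function(i) cov(x[i:(i + window - 1), , drop = FALSE]))
}

set.seed(2017)
df <- matrix(rnorm(100), 20, 5)

blocks  <- rolling_cov(df, window = 5, step = 5)  # non-overlapping: starts 1, 6, 11, 16
overlap <- rolling_cov(df, window = 5, step = 3)  # starts 1, 4, 7, 10, 13, 16
rolling <- rolling_cov(df, window = 5, step = 1)  # classic rolling window
```

Setting `step` equal to `window` gives the non-overlapping blocks from the first part of the answer, and `step = 1` gives the usual rolling window.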