Calculating returns over given periods - R

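I am currently trying to calculate stock returns over different horizons (1, 5, 20, 50, 200 and 250 days), but I have not found a convenient solution yet. As far as I know, quantmod only provides returns for preset periods.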

So I took an existing solution and modified it to give returns rather than differences, which resulted in the following function:

sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}

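My problem now is that I cannot use the result in an xts series, because it apparently contains two values per row: one with the expected n-lagged value in the denominator and one with the current value in the denominator. Interestingly, only the correct value is displayed in the data frame. I calculate the returns as follows: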

#Calculate returns
cCDAX$R1 = sret(cCDAX$Close, 1)
cCDAX$R5 = sret(cCDAX$Close, 5)

This gives me the following values:

    Date        Close   Volume      R1              R5
1   2010-01-04  523.96  137055000   NA              NA
2   2010-01-05  523.64  168916800   -0.0006107336   NA
3   2010-01-06  524.33  145659600   0.0013176992    NA
4   2010-01-07  523.83  182195400   -0.0009535979   NA
5   2010-01-08  525.55  214804700   0.0032835080    NA
6   2010-01-11  525.93  189962700   0.0007230520    3.759829e-03
7   2010-01-12  517.59  191580300   -0.0158576236   -1.155374e-02
8   2010-01-13  519.71  185076700   0.0040959060    -8.811245e-03
9   2010-01-14  522.48  167065200   0.0053298955    -2.577172e-03
10  2010-01-15  513.14  208268000   -0.0178762823   -2.361336e-02
11  2010-01-18  516.37  112098400   0.0062945785    -1.817732e-02
12  2010-01-19  520.56  159323200   0.0081143366    5.738132e-03
13  2010-01-20  510.21  167641400   -0.0198824343   -1.827943e-02
14  2010-01-21  501.77  190062800   -0.0165422081   -3.963788e-02
15  2010-01-22  496.67  240544400   -0.0101640194   -3.209650e-02
16  2010-01-25  491.91  199198900   -0.0095838283   -4.736913e-02
17  2010-01-26  494.76  188213100   0.0057937428    -4.956201e-02
18  2010-01-27  492.25  193048200   -0.0050731668   -3.520119e-02
19  2010-01-28  484.26  229885500   -0.0162315896   -3.489647e-02
20  2010-01-29  489.82  252945300   0.0114814356    -1.379185e-02

When I enter the formula directly in the console, the daily returns look like this:

          lag-1          lag0
1            NA            NA
2 -0.0006107336 -0.0006111069
3  0.0013176992  0.0013159651
4 -0.0009535979 -0.0009545081
5  0.0032835080  0.0032727619

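Since each entry actually holds two values (even if it is not displayed that way), I cannot convert the result into an xts object afterwards, and without an xts object I cannot run my time series analysis. The problem is certainly the denominator, but I need c(-n, 0) to get the correct calculation. I tried several variants, such as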

sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n)))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), n))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(x, c(-n,0)))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(x, n))}

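None of them really worked, so the version above (repeated below) is the only one that gives correct values, but its result cannot be processed afterwards. Does anyone have a solution to suppress or drop the lag-0 column?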

sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}

Before any manipulation, the output of dput(head(cCDAX, 20)) looks as follows:

> dput(head(cCDAX, 20))
structure(list(Date = structure(c(1262559600, 1262646000, 1262732400, 
1262818800, 1262905200, 1263164400, 1263250800, 1263337200, 1263423600, 
1263510000, 1263769200, 1263855600, 1263942000, 1264028400, 1264114800, 
1264374000, 1264460400, 1264546800, 1264633200, 1264719600), class = c("POSIXct", 
"POSIXt"), tzone = ""), Close = c(523.96, 523.64, 524.33, 523.83, 
525.55, 525.93, 517.59, 519.71, 522.48, 513.14, 516.37, 520.56, 
510.21, 501.77, 496.67, 491.91, 494.76, 492.25, 484.26, 489.82
), Volume = c(137055000L, 168916800L, 145659600L, 182195400L, 
214804700L, 189962700L, 191580300L, 185076700L, 167065200L, 208268000L, 
112098400L, 159323200L, 167641400L, 190062800L, 240544400L, 199198900L, 
188213100L, 193048200L, 229885500L, 252945300L)), row.names = c(NA, 
20L), class = "data.frame")

The procedure I am currently running is as follows (I have omitted the different lags):

library(vars)
library(fpp2)
library(nortest)
library(ggpubr)
library(xts)
library(highfrequency)
library(quantmod)
library(pracma)
library(zoo)

# clear all
rm(list=ls())

#Moving Average Function
mav = function(x,n){filter(x, rep(1/n,n), sides = 1)}
#Standard Deviation Function
vari = function(x,n){rollapply(x, width = n, FUN = sd, fill = NA, align = c("right"))}
#Return function
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}

#Loading data, converting the Date column to an actual date-time
cCDAX = read.csv("./CDAX_Clean.csv", header=TRUE, sep=",", dec=".")
cCDAX$Date = as.POSIXct(cCDAX$Date, format = "%d.%m.%Y")

#Adding moving averages for the closing prices
cCDAX$MA5C = mav(cCDAX[,"Close"], 5)
#Calculate standard deviations for the closing prices
cCDAX$SD5C = vari(cCDAX $Close, 5)
#Calculate returns
cCDAX$R1 = sret(cCDAX$Close, 1)
cCDAX$R5 = sret(cCDAX$Close, 5)
#Calculate standard deviation of returns
cCDAX$SD5R = vari(cCDAX $R1, 5)
#Adding moving averages for the daily volumes
cCDAX$MA5V = mav(cCDAX[,"Volume"], 5)
#Calculate standard deviations for the daily volumes
cCDAX$SD5V = vari(cCDAX$Volume, 5)
#Calculate change in daily volume
cCDAX$VC1 = sret(cCDAX$Volume, 1)
cCDAX$VC5 = sret(cCDAX$Volume, 5)
#Calculate standard deviation of volume change
cCDAX$SD5VC = vari(cCDAX $VC1, 5)

#Creating a time series; with omitted variables should be [,2:13] instead of [,2:45]
CDAX_ts = as.xts(cCDAX[,2:45], order.by = cCDAX[,1]) 

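I may be missing something, but isn't the code below what the question asks for? It defines a generic sret and a method for objects of class "xts", along with default and data.frame methods. A method for objects of class "zoo" (but not "xts") can be defined in the same way. After that, simply calling the function works.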

library(xts)
library(quantmod)

sret <- function(x, ...) UseMethod("sret", x)
sret.default <- function(x, n = 1){
  m <- length(x)
  y <- rep(NA_real_, m)
  y[(n + 1):m] <- x[seq_len(m - n)]
  (x - y)/y
}
sret.data.frame <- function(x, n){
  i <- sapply(x, is.numeric)
  x[i] <- lapply(x[i], sret.default, n = n)
  x
}
sret.xts <- function(x, n = 1){
  y <- lag(x, n, na.pad = TRUE)
  (x - y)/y
}

getSymbols("AAPL", from = Sys.Date() - 20)
sret(AAPL, 5)
sret(AAPL$AAPL.Open, 2)
#              AAPL.Open
#2021-03-15           NA
#2021-03-16           NA
#2021-03-17  0.021281733
#2021-03-18 -0.022949219
#2021-03-19 -0.034612185
#2021-03-22 -0.021191681
#2021-03-23  0.027811562
#2021-03-24  0.020273555
#2021-03-25 -0.031704877
#2021-03-26 -0.020523490
#2021-03-29  0.017344850
#2021-03-30 -0.001998143
#2021-03-31  0.000000000
#2021-04-01  0.028707770
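
As a minimal sketch of the "zoo" method mentioned above (it is not part of the original code, so treat it as an assumption): zoo's lag() uses the opposite sign convention to xts, so the lag has to be negative to bring in the value from n observations earlier. The closing prices are the first rows of the question's dput output; the plain integer index and the direct call to sret.zoo are only there to keep the example self-contained.

```r
library(zoo)

# Hypothetical zoo counterpart of sret.xts (an assumption, not from the answer).
sret.zoo <- function(x, n = 1) {
  # In zoo, a negative k shifts earlier values forward in time;
  # na.pad = TRUE keeps the original length by filling with NA.
  y <- lag(x, -n, na.pad = TRUE)
  (x - y) / y
}

# First eight closing prices from the question's data, indexed 1..8.
close <- c(523.96, 523.64, 524.33, 523.83, 525.55, 525.93, 517.59, 519.71)
z <- zoo(close)

r1 <- sret.zoo(z, 1)  # one-day returns, first value NA
r5 <- sret.zoo(z, 5)  # five-day returns, first five values NA
```

With the generic from above defined, sret(z, 5) would dispatch to this method for plain zoo objects, while xts objects would still use sret.xts. The values should agree with the R1 and R5 columns shown in the question.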

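1) diff.zoo and diff.xts have an arithmetic= argument that defaults to TRUE; if it is set to FALSE, ratios are taken instead of differences. It can also be applied to all columns at once.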

library(quantmod)
getSymbols("FB")

k <- 2

diff(Ad(FB), k, arith = FALSE) - 1   # returns over k days of Adjusted Close

diff(FB, k, arith = FALSE) - 1   # returns over k days for each column
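
The same idea can be tried without downloading anything. This is only a sketch: it builds a small xts series from the first closing prices in the question (the names px, horizons and rets are made up for illustration) and merges the returns for several horizons into one xts object.

```r
library(xts)

# First eight rows of the question's data.
close <- c(523.96, 523.64, 524.33, 523.83, 525.55, 525.93, 517.59, 519.71)
dates <- as.Date(c("2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07",
                   "2010-01-08", "2010-01-11", "2010-01-12", "2010-01-13"))
px <- xts(close, order.by = dates)
colnames(px) <- "Close"

# Horizons are illustrative; the question also mentions 20, 50, 200 and 250.
horizons <- c(1, 5)

# arithmetic = FALSE gives the ratio x[t] / x[t - k]; subtracting 1 gives the return.
rets <- do.call(merge, lapply(horizons, function(k) {
  diff(px, lag = k, arithmetic = FALSE) - 1
}))
colnames(rets) <- paste0("R", horizons)

# One xts object holding the prices and all return columns.
out <- merge(px, rets)
```

Because the result is already an xts object, no conversion from a data frame is needed afterwards.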

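2) This also works and uses only base R. If padding is not needed, omit the NA part in the example. k is the number of days over which the return is calculated.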

xx <- c(5, 3, 6, 2, 1)
k <- 2
c(rep(NA, k), exp(diff(log(xx), k)) - 1)

Or on multiple columns at once. BOD is a data frame that ships with R.

rbind(matrix(NA, k, ncol(BOD)), exp(diff(log(as.matrix(BOD)), k)) - 1)
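
For the several horizons asked about in the question, the base R expression can be wrapped in a small helper. This is a sketch under assumptions: ret_k is a made-up name, the prices must be positive (because of the log), and k must be smaller than the length of the series.

```r
# Hypothetical helper: k-period simple return via log differences, NA-padded.
# Assumes positive prices and k < length(x).
ret_k <- function(x, k) {
  c(rep(NA_real_, k), exp(diff(log(x), lag = k)) - 1)
}

# First eight closing prices from the question.
close <- c(523.96, 523.64, 524.33, 523.83, 525.55, 525.93, 517.59, 519.71)

# Illustrative horizons; extend with 20, 50, 200, 250 for a longer series.
horizons <- c(1, 5)

# One column per horizon, same number of rows as the input.
rets <- sapply(horizons, function(k) ret_k(close, k))
colnames(rets) <- paste0("R", horizons)

res <- data.frame(Close = close, rets)
```

Each column is an ordinary numeric vector, so the resulting data frame can be passed to as.xts together with a date index.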