EWMA returns for a time series of returns in R

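What is the best way to calculate EWMA returns for each column of my time series? Every column holds the returns from today - 260 days (about one year back) to today - 1 day.

Note: the returns are calculated daily by dividing closing prices.

I am using the following function: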

ewma.func <- function(rets, lambda) {
    sig.p <- 0
    sig.s <- vapply(rets, function(r) sig.p <<- sig.p*lambda + (r^2)*(1 - lambda), 0)
    return(sqrt(sig.s))
}

But it only produces the EWMA for one column at a time, so I also have to do the following:

ewma_col = NULL

for (w in 1:ncol(df)) {
    ewma_col[[w]] = ewma.func(df[, w], lambda = 0.94)
}

df2 <- do.call(rbind, ewma_col) %>% t()
colnames(df2) = colnames(df)

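This particular object has 5 columns, and the list I am working with contains more than 100 other similar objects, so computing the EWMA for every column of every object this way becomes cumbersome and inefficient. Is there a simpler way?

My sample df: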

structure(list(`25079578000106` = c(0.311405806132825, 0.0261260831393884, 
0.126801611077099, -0.201990496952931, -0.169037712385034, -0.372023939507926, 
0.426906935535953, -0.402262040825008, -0.273008284875687, 0.142923301064002, 
0.0522466965776403, 0.491128923749784, 0.547432459279662, -0.00905547394722817, 
0.243408062669914, -0.565142654522788, -0.0284871479379945, 0.141976900522423, 
-0.115634388475883, 0.0858369759953348, 0.252102295598888, -0.130994651044603, 
0.213179273123387, 0, 0.254748840234242, -0.162688137697842, 
0.0670642675686395, 0.409574624973175, 0.11580733826122, 0.152815408000606, 
-0.194192341950838, 0.079688931509736, 0.0390181277907686, 0.0366672406016733, 
-0.0841513321574894, 0.170703395997407, -0.1032803445014, 0.301935098286776, 
0.12983982123842, 0.179888841921638, -0.04270641511539, 0.194911670405418, 
-0.126730582360324, 0.348033349109755, 0.0962079717282904, 0.0734806822947576, 
0.151055897003971, -0.0701511527950061, -0.161361593563925, 0.246798639636836
), `21144577000147` = c(0.402627056610072, 0.0670045021252008, 
0.136672287590045, -0.257998532470083, -0.126993350295379, -0.57979580369647, 
0.493768537307915, -0.491292521383002, -0.403311319223576, 0.130267872918921, 
-0.0309827290038811, 0.617972996951721, 0.486863606965926, 0.0791557540651411, 
0.221599948054063, -0.743017289278214, -0.122417766579019, 0.199045961198863, 
-0.204796549951425, 0.0958513541263528, 0.221446985779039, -0.103656955252518, 
0.242373424043762, 0, 0.320491723141458, -0.187373789685807, 
0.073898113987525, 0.443193321189028, 0.131922088075953, 0.167945069370035, 
-0.218753095850843, 0.0856883381857187, 0.0915706430532737, 0.0253365722528542, 
-0.10242040234516, 0.210685512865894, -0.111825193016557, 0.343926238201675, 
0.145042635631398, 0.169826889032265, -0.085688805575046, 0.256890913078678, 
-0.173078901207191, 0.502885210153181, 0.0139494439281407, 0.111911786007113, 
0.124141056221561, -0.1009381527183, -0.164678661440121, 0.270671359612606
), `19107923000175` = c(1.17081038442848, -1.53767897591024, 
-0.511278352678346, 0.801980435971927, -1.1354756311448, 1.33550018854294, 
-0.877121115991031, 0.893385693962045, -3.05784205729651, 0.790948188478069, 
-0.874211667633062, -2.0517918994301, 0.108547761010414, 1.31951493240194, 
0.59011726098106, 0.751824284998293, -2.542040795106, -1.30722252988562, 
0.166101507966232, -0.333577277251607, -0.48391700402135, -0.287302340893802, 
0.276978237343428, 0, -2.20114477424431, 2.28636453339277, 3.21842714220111, 
0.591201915267447, 1.88892838687025, -2.4835963874466, 0.93808037963754, 
-2.02373054462441, 1.10818007306079, 0.963590860919794, 0.221162120942608, 
0.927865234370984, 1.30669520840456, -1.5475129142942, 1.44346553624928, 
-1.33299447861646, 2.56613694509724, -0.854390492077073, 0.431278918404132, 
-0.419447091917391, 0.437028634769376, -0.279096110807586, 0.702864309823781, 
1.8092529326168, -1.76575759915067, 1.79323091451806), `25079578000106` = c(-1.46258859240334, 
-1.08758898677479, -0.0989607635347056, 0.877778709582344, -1.81190225830505, 
3.65239411476068, -0.591252178764989, 1.21883593492385, 2.9361510378294, 
-1.21526156190157, 5.60858230674057, -0.483417673513031, 3.11737542488117, 
-0.928573480450723, -0.855911339203885, 1.42741011768521, 2.48564664470905, 
3.64030099535739, -0.0133031404402573, 1.84565666459093, 3.33521612974437, 
-0.706821796120494, -1.41998375802359, 0, -1.00702592444577, 
-0.764259576953918, 0.504494091364904, 2.34908743768756, 1.12513038984616, 
0.883990707916382, -0.23625019375686, 0.794114018390246, -1.84599011799946, 
1.00693676176888, -2.68018999058768, 2.1352680909331, -0.361733150930377, 
1.57261038511933, 0.0516994778081425, 1.29365618286101, 1.84691599060898, 
-0.271832695671037, 1.894436301518, 0.0966644805885153, 1.10278020638361, 
-1.48991306559765, 0.533713807271852, 0.703722278376517, -0.931114916329534, 
2.53580948592571), `19436835000117` = c(1.49069022500044, -1.29966042904925, 
-0.395616604691895, 0.727380076932604, -0.30439719239439, 0.550924036724609, 
-0.846017086223583, 0.841084288731508, -2.71310085681762, 0.432345969238668, 
-1.42297721340583, -1.75329706107732, -0.234704765443894, 1.02912636612018, 
0.953879318876716, 0.506016590225045, -2.46852979989853, -1.29307204251745, 
-0.361195165078243, 0.142310472620011, -0.545533438798884, 0.0622563582510338, 
0.664697968204564, 0, -2.65178033678239, 1.65225289310911, 2.5845850508631, 
0.743106457593967, 1.91502897378086, -2.12601029097641, 0.531378326195409, 
-1.64881667524241, 0.658820966236817, 0.782823536428623, -0.430202234929311, 
0.941061544290278, 1.38020377507928, -1.04732682539179, 0.918463659036206, 
-0.891537194911507, 2.72019066906068, -0.480601724302687, 0.65309472320223, 
0.334795709022728, 0.0443713630374987, -0.747195361418562, 0.921720304359042, 
1.04346702937619, -1.57727738560425, 1.28708233642101)), row.names = c("Retorno D - 260", 
"Retorno D - 259", "Retorno D - 258", "Retorno D - 257", "Retorno D - 256", 
"Retorno D - 255", "Retorno D - 254", "Retorno D - 253", "Retorno D - 252", 
"Retorno D - 251", "Retorno D - 250", "Retorno D - 249", "Retorno D - 248", 
"Retorno D - 247", "Retorno D - 246", "Retorno D - 245", "Retorno D - 244", 
"Retorno D - 243", "Retorno D - 242", "Retorno D - 241", "Retorno D - 240", 
"Retorno D - 239", "Retorno D - 238", "Retorno D - 237", "Retorno D - 236", 
"Retorno D - 235", "Retorno D - 234", "Retorno D - 233", "Retorno D - 232", 
"Retorno D - 231", "Retorno D - 230", "Retorno D - 229", "Retorno D - 228", 
"Retorno D - 227", "Retorno D - 226", "Retorno D - 225", "Retorno D - 224", 
"Retorno D - 223", "Retorno D - 222", "Retorno D - 221", "Retorno D - 220", 
"Retorno D - 219", "Retorno D - 218", "Retorno D - 217", "Retorno D - 216", 
"Retorno D - 215", "Retorno D - 214", "Retorno D - 213", "Retorno D - 212", 
"Retorno D - 211"), class = "data.frame")

You can do this in one line with map_df from the purrr package (here df1 holds the sample data frame).

library(dplyr)
library(purrr)

ewma_col <- map_df(df1, ewma.func, lambda = 0.94)
ewma_col
# A tibble: 50 x 5
   `25079578000106` `21144577000147` `19107923000175` `32666326000149` `19436835000117`
              <dbl>            <dbl>            <dbl>            <dbl>            <dbl>
 1           0.0763           0.0986            0.287            0.358            0.365
 2           0.0742           0.0970            0.468            0.438            0.476
 3           0.0784           0.0998            0.471            0.425            0.472
 4           0.0907           0.116             0.497            0.465            0.491
 5           0.0972           0.116             0.556            0.633            0.482
 6           0.131            0.181             0.631            1.08             0.486
 7           0.165            0.213             0.648            1.06             0.515
 8           0.188            0.239             0.666            1.07             0.540
 9           0.194            0.252             0.989            1.26             0.846
10           0.191            0.247             0.978            1.26             0.827
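
The same idea extends to a whole list of data frames. Here is a minimal sketch in base R, assuming the objects are stored in a list; the name df_list and the data in it are made up for illustration, and ewma.func is repeated from the question so the snippet runs on its own.

```r
# ewma.func as defined in the question
ewma.func <- function(rets, lambda) {
  sig.p <- 0
  sig.s <- vapply(rets, function(r) sig.p <<- sig.p * lambda + (r^2) * (1 - lambda), 0)
  sqrt(sig.s)
}

# Hypothetical stand-in for the real list of ~100 data frames
set.seed(1)
df_list <- list(
  fund_a = data.frame(x = rnorm(5), y = rnorm(5)),
  fund_b = data.frame(x = rnorm(5), y = rnorm(5), z = rnorm(5))
)

# Apply ewma.func to every column of every data frame.
# Assigning into d[] keeps the column names and row names of d.
ewma_list <- lapply(df_list, function(d) {
  d[] <- lapply(d, ewma.func, lambda = 0.94)
  d
})

str(ewma_list)
```

The result is a list with the same names as df_list, where each element has the same shape as its input.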

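A functional programming solution to calculate EWMA

It is desirable to avoid <<-, the assignment that ignores local scope. That small assignment is easy to miss when refactoring or copy/pasting code.

Source the example data and load the required libraries

Copy and paste the dput output above into a text file, so that we can source the file and store the resulting data.frame in a variable (df).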

DPUT_TEXT_FILE <- '/tmp/example_dput.txt'
df <- source(DPUT_TEXT_FILE)$value

We will use purrr to create functions on the fly, and dplyr to apply the function to every column of the data frame.

library(dplyr)
library(purrr)

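The inner calculation

The innermost calculation in the question body boils down to the following. An important aspect of the original implementation (which we are now avoiding) is that the result of this function is fed back into subsequent calls.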

# Calculate single statistic
single_ewa <- function(sig_p, val, lambda){ 
  sig_p*lambda + (val^2)*(1 - lambda)
}
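
To make the feedback concrete, here is a small sketch that chains two calls by hand. The return values 0.5 and 1.0 are made up for illustration, and single_ewa is repeated so the snippet runs on its own.

```r
single_ewa <- function(sig_p, val, lambda) {
  sig_p * lambda + (val^2) * (1 - lambda)
}

# First step starts from the initial state 0:
# 0 * 0.94 + 0.5^2 * 0.06 = 0.015
v1 <- single_ewa(0, 0.5, lambda = 0.94)

# Second step reuses v1 as the previous state:
# 0.015 * 0.94 + 1^2 * 0.06 = 0.0741
v2 <- single_ewa(v1, 1.0, lambda = 0.94)

# The EWMA volatility is the square root of each accumulated value
sqrt(c(v1, v2))
```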

We solve the problem with a partially applied function and an accumulator function.

# Calculate full weighted moving average
calc_ewma <- function(vals, lambda){
  # Partially apply calc single stat function to set lambda
  part_ewa <- partial(single_ewa, lambda=lambda)

  # Reduce and accumulate to get the raw moving results
  raw_result <- Reduce(f=part_ewa, x=vals, init=0, accumulate = TRUE )

  # Square root finishes the calculation
  result <- sqrt(raw_result)

  # Finally, drop the initial condition from the accumulation
  # (negative indexing also behaves correctly for an empty input)
  result <- result[-1]
  return(result)
}

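Partially applied function

Using purrr::partial we can fix lambda and get back a function that no longer needs it passed as an argument. This is partial application, which is closely related to (though not strictly the same as) currying. It gives us a function with the arity (number of arguments) we are looking for: the accumulated value sig_p and the new value val.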
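
For readers without purrr, the effect of fixing lambda can be sketched with a plain closure in base R. This is an illustrative equivalent, not what purrr::partial does internally, and the helper name fix_lambda is hypothetical.

```r
single_ewa <- function(sig_p, val, lambda) {
  sig_p * lambda + (val^2) * (1 - lambda)
}

# Hypothetical helper: fix lambda and return a two-argument function
fix_lambda <- function(lambda) {
  force(lambda)  # evaluate lambda now rather than lazily
  function(sig_p, val) single_ewa(sig_p, val, lambda = lambda)
}

part_ewa <- fix_lambda(0.94)

# The new function takes two arguments instead of three
n_args <- length(formals(part_ewa))

# It gives the same result as calling single_ewa with lambda spelled out
same <- isTRUE(all.equal(part_ewa(0, 0.5), single_ewa(0, 0.5, lambda = 0.94)))
```

A two-argument function is exactly the shape Reduce expects: the first argument is the running state, the second is the next element.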

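Accumulator function, Reduce

We use Reduce to start from an initial value of 0, feed it the values of the input vector one at a time, run the function on the running result and the next value, and carry the accumulated result into the next iteration. This happens once for every element of the input vector. Using accumulate=TRUE produces a vector of accumulated values rather than only the final value.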
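
A quick illustration of these options, using simple addition rather than the EWMA step so the values are easy to follow:

```r
# Without accumulate, only the final value is returned
total <- Reduce(`+`, 1:4)                       # 10

# With accumulate = TRUE, every intermediate value is kept
running <- Reduce(`+`, 1:4, accumulate = TRUE)  # 1 3 6 10

# Supplying init prepends the initial value, so the result has one
# more element than the input
with_init <- Reduce(`+`, 1:4, init = 0, accumulate = TRUE)  # 0 1 3 6 10
```

The extra leading element in the last call is why calc_ewma drops the first entry of its result.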

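Finally, hack the column names and finish the job

Temporarily swapping the column names for enumerated names avoids the error that dplyr functions throw about non-unique column names.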

# Save old colnames as hack around duplicate column names
old_colnames <- colnames(df)
colnames(df) <- as.character(1:ncol(df))

# Do calculation
ewma_df <- df %>% mutate_all(.funs=calc_ewma, lambda=0.94)

# re-assign colnames
colnames(df) <- old_colnames
colnames(ewma_df) <- old_colnames
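
One way to gain confidence in the refactoring is to check that it reproduces the ewma.func from the question. The sketch below does so on a made-up return vector, using a dependency-free variant of the Reduce approach; the name calc_ewma_base is hypothetical and a closure stands in for purrr::partial.

```r
# Original implementation from the question
ewma.func <- function(rets, lambda) {
  sig.p <- 0
  sig.s <- vapply(rets, function(r) sig.p <<- sig.p * lambda + (r^2) * (1 - lambda), 0)
  sqrt(sig.s)
}

# Reduce-based variant without purrr (hypothetical name)
calc_ewma_base <- function(vals, lambda) {
  step <- function(sig_p, val) sig_p * lambda + (val^2) * (1 - lambda)
  raw <- Reduce(step, vals, init = 0, accumulate = TRUE)
  sqrt(raw)[-1]  # drop the initial condition
}

# Made-up returns, for illustration only
rets <- c(0.31, 0.03, 0.13, -0.20, -0.17)

agree <- isTRUE(all.equal(ewma.func(rets, 0.94), calc_ewma_base(rets, 0.94)))
agree
```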