Calculate average monthly returns in data.table with a differing number of stocks in each month
Suppose I have a data.table, priceDT, with daily return observations for several stocks, as follows:
> priceDT
          Date      Return Share
 1: 2011-01-03  0.04500000   GAI
 2: 2011-01-03 -0.02100000   KDV
 3: 2011-01-04  0.03300000   GAI
 4: 2011-01-04  0.01770000   KDV
 5: 2011-01-05 -0.01742000   GAI
 6: 2011-01-05  0.07900000   KDV
 7: 2011-02-06  0.02400000   GAI
 8: 2011-02-06 -0.02110000   KDV
 9: 2011-02-07 -0.04300000   AFT
10: 2011-02-07  0.01199700   AIP
11: 2011-02-07  0.00551810   ARH
12: 2011-02-07  0.07451101   BIK
13: 2011-02-07 -0.03495597   BLU
14: 2011-02-07 -0.06062462   CGR
15: 2011-02-07 -0.03660000   GAI
16: 2011-02-07 -0.01240000   KDV
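I want to calculate the monthly mean return across all stocks for a given month. So for January 2011, that is the average return of two shares; the Share column tells us there are only two. The first step is to get each share's mean return for the month. The second step is to take the mean of those per-share means across the shares in the portfolio that month. For January, the mean for GAI is 0.02019333 and the mean for KDV is 0.02523333, so the mean for the month is 0.02271333.
That is the logic of the portfolio return. I want to repeat it for the remaining months in the data.table.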
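The January arithmetic can be checked by hand. This is only a sketch using the January values from the sample data; the variable names gai, kdv, share_means and jan_mean are chosen here for illustration.

```r
# January 2011 returns per share, copied from the sample data
gai <- c(0.045, 0.033, -0.01742)
kdv <- c(-0.021, 0.0177, 0.079)

# Step 1: mean return of each share within the month
share_means <- c(GAI = mean(gai), KDV = mean(kdv))
# GAI is about 0.02019333, KDV is about 0.02523333

# Step 2: mean of the per-share means gives the portfolio return for the month
jan_mean <- mean(share_means)
# about 0.02271333
```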
For my sample data, I would like this result:
portfolio
      Date avg_return
1: 2011-01  0.02271333
2: 2011-02 -0.008700561
Data:
library(data.table)
priceDT <- fread(text = "Date, Return, Share
2011-01-03,0.045,GAI
2011-01-03,-0.021,KDV
2011-01-04,0.033,GAI
2011-01-04,0.0177,KDV
2011-01-05,-0.01742,GAI
2011-01-05,0.079,KDV
2011-02-06,0.024,GAI
2011-02-06,-0.0211,KDV
2011-02-07,-0.043,AFT
2011-02-07,0.011997,AIP
2011-02-07,0.0055181,ARH
2011-02-07,0.074511006,BIK
2011-02-07,-0.034955973,BLU
2011-02-07,-0.060624622,CGR
2011-02-07,-0.0366,GAI
2011-02-07,-0.0124,KDV
")
portfolio <- fread(text = "Date, avg_return
2011-01,0.022713333
2011-02,-0.01194431
")
I don't understand how you take the step that gets the monthly mean return across all shares...
But maybe this will get you started?
# convert the Date column to Date class
priceDT[, Date := as.Date(Date)]
# step 1: mean return by share and month
priceDT[, .(avg_return = mean(Return, na.rm = TRUE)),
        by = .(month = format(Date, "%Y-%m"), Share)]
But from here, I can't see the logic to arrive at portfolio...
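Here is another approach, although my February result (-0.008700561) does not match the -0.01194431 in your portfolio data.
You can create a "year-month" column to group the results by. Following your steps, you compute the mean return of each share in each month; call it ShareMean.
Then you compute the mean of those means across all shares in a given month; call it MonthMean.
Is this what you had in mind?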
library(data.table)
# year-month key, e.g. "2011-01"
priceDT[, YearMonth := substr(Date, 1, 7)]
# per-share monthly means, then the mean of those means per month
priceDT[, .(ShareMean = mean(Return)), by = c("YearMonth", "Share")][
  , .(MonthMean = mean(ShareMean)), by = "YearMonth"]
Output
   YearMonth    MonthMean
1:   2011-01  0.022713333
2:   2011-02 -0.008700561
You can calculate the monthly return directly; I would do it like this:
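library(tidyverse)
library(lubridate)
# pooled mean over all rows in a month (every observation weighted equally);
# grouping by month name alone merges the same month of different years
priceDT %>%
  mutate(month = month.abb[month(Date)]) %>%
  group_by(month) %>%
  summarise(avg_return = mean(Return))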
(month.abb[month(Date)] gives the month abbreviation, e.g. Jan, Feb.)
Or first calculate the mean of each share within a given month:
# mean return per share within each month
priceDT %>%
  mutate(month = month.abb[month(Date)]) %>%
  group_by(month, Share) %>%
  summarise(avg_return = mean(Return))
Then you can calculate the monthly mean return as above.
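A minimal sketch of that second step, assuming dplyr is installed. It uses a year-month key from format(Date, "%Y-%m") rather than month.abb so that months from different years stay separate, and the column name share_mean is chosen here for illustration.

```r
library(dplyr)

# Sample data from the question, rebuilt as a plain data frame
priceDT <- read.csv(text = "Date,Return,Share
2011-01-03,0.045,GAI
2011-01-03,-0.021,KDV
2011-01-04,0.033,GAI
2011-01-04,0.0177,KDV
2011-01-05,-0.01742,GAI
2011-01-05,0.079,KDV
2011-02-06,0.024,GAI
2011-02-06,-0.0211,KDV
2011-02-07,-0.043,AFT
2011-02-07,0.011997,AIP
2011-02-07,0.0055181,ARH
2011-02-07,0.074511006,BIK
2011-02-07,-0.034955973,BLU
2011-02-07,-0.060624622,CGR
2011-02-07,-0.0366,GAI
2011-02-07,-0.0124,KDV", stringsAsFactors = FALSE)
priceDT$Date <- as.Date(priceDT$Date)

portfolio <- priceDT %>%
  mutate(month = format(Date, "%Y-%m")) %>%
  # step 1: mean return per share within each month
  group_by(month, Share) %>%
  summarise(share_mean = mean(Return), .groups = "drop") %>%
  # step 2: mean of the per-share means within each month
  group_by(month) %>%
  summarise(avg_return = mean(share_mean))

portfolio
# 2011-01: avg_return is about  0.02271333
# 2011-02: avg_return is about -0.008700561
```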
priceDT[, mean(Return), by = .(ym = format(Date, "%Y-%m"), Share)
        ][, mean(V1), by = ym]
#         ym           V1
# 1: 2011-01  0.022713333
# 2: 2011-02 -0.008700561
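To see why the two-step grouping matters when the number of stocks (and of observations per stock) changes between months, the sketch below compares it with a single pooled mean per month on the sample data. The names ym, pooled, share_mean, two_step and cmp are chosen for this illustration.

```r
library(data.table)

priceDT <- fread(text = "Date,Return,Share
2011-01-03,0.045,GAI
2011-01-03,-0.021,KDV
2011-01-04,0.033,GAI
2011-01-04,0.0177,KDV
2011-01-05,-0.01742,GAI
2011-01-05,0.079,KDV
2011-02-06,0.024,GAI
2011-02-06,-0.0211,KDV
2011-02-07,-0.043,AFT
2011-02-07,0.011997,AIP
2011-02-07,0.0055181,ARH
2011-02-07,0.074511006,BIK
2011-02-07,-0.034955973,BLU
2011-02-07,-0.060624622,CGR
2011-02-07,-0.0366,GAI
2011-02-07,-0.0124,KDV")

# year-month key such as "2011-01"
priceDT[, ym := substr(Date, 1, 7)]

# pooled: every row in the month gets equal weight
pooled <- priceDT[, .(pooled = mean(Return)), by = ym]

# two-step: every share in the month gets equal weight
two_step <- priceDT[, .(share_mean = mean(Return)), by = .(ym, Share)][
  , .(two_step = mean(share_mean)), by = ym]

cmp <- merge(pooled, two_step, by = "ym")
cmp
# 2011-01: both are about 0.02271333 (each share has three observations)
# 2011-02: pooled is about -0.009265449, two_step is about -0.008700561
```

In February GAI and KDV have two observations each while the other six shares have one, so the pooled mean weights GAI and KDV more heavily than the two-step mean does.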