时间序列股票表现分组排名的 R 函数
R function for grouped ranking of stock performance for time-series
我有每月 return 的长格式。对于每个月,我想根据他们的表现将公司分成 10 个相同大小的组(十分位数)。
我的数据样本如下(此处按值升序排列,即每月 return):
print(R4, n=10)
# A tibble: 1,125 x 5
# Groups: year [3]
company year month value mktvalue
<chr> <dbl> <dbl> <dbl> <dbl>
1 STIFEL FINANCIAL 1997 7 0.821 50.3
2 RAYMOND JAMES FINL. 1997 10 0.833 1070.
3 CHARLES SCHWAB 1997 3 0.853 6250.
4 STATE STREET 1997 3 0.863 6178.
5 STIFEL FINANCIAL 1996 7 0.871 31.6
6 FRANKLIN RESOURCES 1997 3 0.872 7459.
7 BERKSHIRE HATHAWAY 'A' 1997 8 0.879 53857.
8 ALLIANCEBERNSTEIN HLDG. UNT. 1997 3 0.879 2257.
9 STATE STREET 1997 8 0.890 8504.
10 MORGAN STANLEY 1996 7 0.891 8764.
# ... with 1,115 more rows
因为我要给每个月的公司分配排名,所以我先过滤了一个月:
R5 <- R4 %>%
filter(year == 1997, month == 12, !is.na(value)) %>%
arrange(value) %>%
mutate(rank = rank(value))
print(R5)
# A tibble: 13 x 6
# Groups: year [1]
company year month value mktvalue rank
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 FRANKLIN RESOURCES 1997 12 0.967 11505. 1
2 STATE STREET 1997 12 0.978 9250. 2
3 JEFFERIES FINANCIAL GROUP 1997 12 0.998 2175. 3
4 BERKSHIRE HATHAWAY 'A' 1997 12 1.02 55184. 4
5 BANK OF NEW YORK MELLON 1997 12 1.08 21152. 5
6 CHARLES SCHWAB 1997 12 1.09 10854. 6
7 MORGAN STANLEY 1997 12 1.09 32712. 7
8 RAYMOND JAMES FINL. 1997 12 1.11 1207. 8
9 MGIC INVESTMENT 1997 12 1.14 7030. 9
10 ALLIANCEBERNSTEIN HLDG. UNT. 1997 12 1.15 3107. 10
11 AFFILIATED MANAGERS 1997 12 1.16 373. 11
12 RADIAN GP. 1997 12 1.16 1284. 12
13 STIFEL FINANCIAL 1997 12 1.17 81.4 13
我试过这个问题的答案来将公司分组:
Grouped ranking in R
请告诉我是否有更聪明的方法来创建等分组。
percent.rank <- function(x) trunc(rank(x)/length(x)*100)
R6 <- within(R5, pr <- percent.rank(rank))
R6$decile <- cut(R6$pr, breaks = c(0,10,20,30,40,50,60,70,80,90,100), labels = c(1:10))
print(R6)
# A tibble: 13 x 8
# Groups: year [1]
company year month value mktvalue rank pr decile
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
1 FRANKLIN RESOURCES 1997 12 0.967 11505. 1 7 1
2 STATE STREET 1997 12 0.978 9250. 2 15 2
3 JEFFERIES FINANCIAL GROUP 1997 12 0.998 2175. 3 23 3
4 BERKSHIRE HATHAWAY 'A' 1997 12 1.02 55184. 4 30 3
5 BANK OF NEW YORK MELLON 1997 12 1.08 21152. 5 38 4
6 CHARLES SCHWAB 1997 12 1.09 10854. 6 46 5
7 MORGAN STANLEY 1997 12 1.09 32712. 7 53 6
8 RAYMOND JAMES FINL. 1997 12 1.11 1207. 8 61 7
9 MGIC INVESTMENT 1997 12 1.14 7030. 9 69 7
10 ALLIANCEBERNSTEIN HLDG. UNT. 1997 12 1.15 3107. 10 76 8
11 AFFILIATED MANAGERS 1997 12 1.16 373. 11 84 9
12 RADIAN GP. 1997 12 1.16 1284. 12 92 10
13 STIFEL FINANCIAL 1997 12 1.17 81.4 13 100 10
到目前为止,我手动完成了一个月。 如何在不手动过滤的情况下将此应用到我的数据框的每个月和每年?
我的整个数据将从 1996 年 1 月到 2020 年 12 月。groups/deciles 必须大小相等,并根据月度表现(值)排名。有什么循环的方法或者其他聪明的方法吗?
重现我的数据:
R4 <- data.frame(company = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"),
year = c(1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996,1996, 1996, 1996,
1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997),
month = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
value = c(1.15999568000864, 0.983783783783784, 1.07692307692308, 1, 0.989795918367347,
0.989690721649484, 1, 1.04166666666667, 0.99, 1.05050505050505, 1.07211538461538,
1.01345291479821, 0.942477876106195, 1.06572769953052, 0.986784140969163,
0.879464285714286, 1.11167512690355, 0.922374429223744, 1.15841584158416,
1.18803418803419, 0.973018705035971, 1.10906058132519, 0.92, 1.00724637681159))
我不确定我是否完全按照您要执行的操作进行操作,但根据您的数据,我能够为每个 month
/[=14 创建一个按 value
排序的分组=] 与此组合:
library(dplyr)
R4 %>%
group_by(year, month) %>%
mutate(rank = rank(value)) %>%
arrange(month, year, rank)
这产生:
# A tibble: 60 × 6
# Groups: year, month [24]
company year month value mktvalue rank
<fct> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ALLIANCEBERNSTEIN HLDG. UNT. 1996 1 0.984 1802. 1
2 BERKSHIRE HATHAWAY 'A' 1996 1 0.994 37088. 2
3 BANK OF NEW YORK MELLON 1996 1 1.05 9517. 3
4 ALLIANCEBERNSTEIN HLDG. UNT. 1997 1 1.07 2307. 1
5 BANK OF NEW YORK MELLON 1997 1 1.08 13652. 2
6 BANK OF NEW YORK MELLON 1996 2 1.01 10488. 1
7 ALLIANCEBERNSTEIN HLDG. UNT. 1996 2 1.08 1931. 2
8 BERKSHIRE HATHAWAY 'A' 1996 2 1.11 39584. 3
9 ALLIANCEBERNSTEIN HLDG. UNT. 1997 2 0.987 2441. 1
10 BANK OF NEW YORK MELLON 1997 2 1.06 15457. 2
我有每月 return 的长格式。对于每个月,我想根据他们的表现将公司分成 10 个相同大小的组(十分位数)。
我的数据样本如下(此处按值升序排列,即每月 return):
print(R4, n=10)
# A tibble: 1,125 x 5
# Groups: year [3]
company year month value mktvalue
<chr> <dbl> <dbl> <dbl> <dbl>
1 STIFEL FINANCIAL 1997 7 0.821 50.3
2 RAYMOND JAMES FINL. 1997 10 0.833 1070.
3 CHARLES SCHWAB 1997 3 0.853 6250.
4 STATE STREET 1997 3 0.863 6178.
5 STIFEL FINANCIAL 1996 7 0.871 31.6
6 FRANKLIN RESOURCES 1997 3 0.872 7459.
7 BERKSHIRE HATHAWAY 'A' 1997 8 0.879 53857.
8 ALLIANCEBERNSTEIN HLDG. UNT. 1997 3 0.879 2257.
9 STATE STREET 1997 8 0.890 8504.
10 MORGAN STANLEY 1996 7 0.891 8764.
# ... with 1,115 more rows
因为我要给每个月的公司分配排名,所以我先过滤了一个月:
R5 <- R4 %>%
filter(year == 1997, month == 12, !is.na(value)) %>%
arrange(value) %>%
mutate(rank = rank(value))
print(R5)
# A tibble: 13 x 6
# Groups: year [1]
company year month value mktvalue rank
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 FRANKLIN RESOURCES 1997 12 0.967 11505. 1
2 STATE STREET 1997 12 0.978 9250. 2
3 JEFFERIES FINANCIAL GROUP 1997 12 0.998 2175. 3
4 BERKSHIRE HATHAWAY 'A' 1997 12 1.02 55184. 4
5 BANK OF NEW YORK MELLON 1997 12 1.08 21152. 5
6 CHARLES SCHWAB 1997 12 1.09 10854. 6
7 MORGAN STANLEY 1997 12 1.09 32712. 7
8 RAYMOND JAMES FINL. 1997 12 1.11 1207. 8
9 MGIC INVESTMENT 1997 12 1.14 7030. 9
10 ALLIANCEBERNSTEIN HLDG. UNT. 1997 12 1.15 3107. 10
11 AFFILIATED MANAGERS 1997 12 1.16 373. 11
12 RADIAN GP. 1997 12 1.16 1284. 12
13 STIFEL FINANCIAL 1997 12 1.17 81.4 13
我试过这个问题的答案来将公司分组: Grouped ranking in R
请告诉我是否有更聪明的方法来创建等分组。
percent.rank <- function(x) trunc(rank(x)/length(x)*100)
R6 <- within(R5, pr <- percent.rank(rank))
R6$decile <- cut(R6$pr, breaks = c(0,10,20,30,40,50,60,70,80,90,100), labels = c(1:10))
print(R6)
# A tibble: 13 x 8
# Groups: year [1]
company year month value mktvalue rank pr decile
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
1 FRANKLIN RESOURCES 1997 12 0.967 11505. 1 7 1
2 STATE STREET 1997 12 0.978 9250. 2 15 2
3 JEFFERIES FINANCIAL GROUP 1997 12 0.998 2175. 3 23 3
4 BERKSHIRE HATHAWAY 'A' 1997 12 1.02 55184. 4 30 3
5 BANK OF NEW YORK MELLON 1997 12 1.08 21152. 5 38 4
6 CHARLES SCHWAB 1997 12 1.09 10854. 6 46 5
7 MORGAN STANLEY 1997 12 1.09 32712. 7 53 6
8 RAYMOND JAMES FINL. 1997 12 1.11 1207. 8 61 7
9 MGIC INVESTMENT 1997 12 1.14 7030. 9 69 7
10 ALLIANCEBERNSTEIN HLDG. UNT. 1997 12 1.15 3107. 10 76 8
11 AFFILIATED MANAGERS 1997 12 1.16 373. 11 84 9
12 RADIAN GP. 1997 12 1.16 1284. 12 92 10
13 STIFEL FINANCIAL 1997 12 1.17 81.4 13 100 10
到目前为止,我手动完成了一个月。 如何在不手动过滤的情况下将此应用到我的数据框的每个月和每年? 我的整个数据将从 1996 年 1 月到 2020 年 12 月。groups/deciles 必须大小相等,并根据月度表现(值)排名。有什么循环的方法或者其他聪明的方法吗?
重现我的数据:
R4 <- data.frame(company = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"),
year = c(1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996,1996, 1996, 1996,
1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997),
month = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
value = c(1.15999568000864, 0.983783783783784, 1.07692307692308, 1, 0.989795918367347,
0.989690721649484, 1, 1.04166666666667, 0.99, 1.05050505050505, 1.07211538461538,
1.01345291479821, 0.942477876106195, 1.06572769953052, 0.986784140969163,
0.879464285714286, 1.11167512690355, 0.922374429223744, 1.15841584158416,
1.18803418803419, 0.973018705035971, 1.10906058132519, 0.92, 1.00724637681159))
我不确定我是否完全按照您要执行的操作进行操作,但根据您的数据,我能够为每个 month
/[=14 创建一个按 value
排序的分组=] 与此组合:
library(dplyr)
R4 %>%
group_by(year, month) %>%
mutate(rank = rank(value)) %>%
arrange(month, year, rank)
这产生:
# A tibble: 60 × 6
# Groups: year, month [24]
company year month value mktvalue rank
<fct> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ALLIANCEBERNSTEIN HLDG. UNT. 1996 1 0.984 1802. 1
2 BERKSHIRE HATHAWAY 'A' 1996 1 0.994 37088. 2
3 BANK OF NEW YORK MELLON 1996 1 1.05 9517. 3
4 ALLIANCEBERNSTEIN HLDG. UNT. 1997 1 1.07 2307. 1
5 BANK OF NEW YORK MELLON 1997 1 1.08 13652. 2
6 BANK OF NEW YORK MELLON 1996 2 1.01 10488. 1
7 ALLIANCEBERNSTEIN HLDG. UNT. 1996 2 1.08 1931. 2
8 BERKSHIRE HATHAWAY 'A' 1996 2 1.11 39584. 3
9 ALLIANCEBERNSTEIN HLDG. UNT. 1997 2 0.987 2441. 1
10 BANK OF NEW YORK MELLON 1997 2 1.06 15457. 2