How to categorise years with cut?

I am trying to put the years 1928 to 2022 into decade categories in R, because I want to calculate the average stock market return for each decade.

SP500 %>% mutate(decade = cut(Year, seq(1930, 2020, by = 10))) %>% 
  group_by(decade) %>% summarise(return = mean(`Annual\n% Change`))
# A tibble: 10 × 2
   decade              return
   <fct>                <dbl>
 1 (1.93e+03,1.94e+03]  0.014
 2 (1.94e+03,1.95e+03]  0.077
 3 (1.95e+03,1.96e+03]  0.124
 4 (1.96e+03,1.97e+03]  0.056
 5 (1.97e+03,1.98e+03]  0.058
 6 (1.98e+03,1.99e+03]  0.098
 7 (1.99e+03,2e+03]     0.157
 8 (2e+03,2.01e+03]     0.018
 9 (2.01e+03,2.02e+03]  0.121
10 NA                   0.04 

How can I change the labels of the decade factor to something like 1930-1940, 1940-1950, and so on?

Thanks a lot.

The labels= argument of the function you are using, cut(), accepts a vector of length length(breaks)-1 that overrides the default labels.

years <- seq(1920, 2020, by = 10)
length(years); years
# [1] 11
#  [1] 1920 1930 1940 1950 1960 1970 1980 1990 2000 2010 2020
labels <- paste(years[-length(years)], years[-1], sep = "-")
length(labels); labels
# [1] 10
#  [1] "1920-1930" "1930-1940" "1940-1950" "1950-1960" "1960-1970" "1970-1980" "1980-1990" "1990-2000" "2000-2010"
# [10] "2010-2020"
cut(c(1935, 1959, 1960), years, labels = labels)
# [1] 1930-1940 1950-1960 1950-1960
# Levels: 1920-1930 1930-1940 1940-1950 1950-1960 1960-1970 1970-1980 1980-1990 1990-2000 2000-2010 2010-2020
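
To connect this back to the question, here is a minimal sketch of applying such labels to a data frame. The sp500 data frame and its change column are hypothetical toy stand-ins for the asker's data, and the breaks are widened to 1920-2030 on the assumption that every year from 1928 to 2022 should land in a bin rather than in NA. Base R's tapply() is used only to keep the sketch free of package dependencies; the same cut() call should work inside mutate().

```r
# Hypothetical toy data standing in for the asker's SP500 table
sp500 <- data.frame(
  Year   = c(1928, 1935, 1940, 1941, 2019, 2020, 2022),
  change = c(0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70)
)

# Breaks wide enough to cover 1928-2022, so no year falls outside a bin
years  <- seq(1920, 2030, by = 10)
labels <- paste(years[-length(years)], years[-1], sep = "-")

# cut() is right-closed by default, so 1940 belongs to "1930-1940"
sp500$decade <- cut(sp500$Year, breaks = years, labels = labels)

# Mean of the toy column per decade; decades with no rows are dropped
decade_means <- tapply(sp500$change, droplevels(sp500$decade), mean)
decade_means
```
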

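Or, to make the range names more accurate, we can do the following. Because cut() intervals are right-closed by default, the bin (1930,1940] actually covers 1931 to 1940: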

labels <- paste(1 + years[-length(years)], years[-1], sep = "-")
cut(c(1935, 1959, 1960), years, labels = labels)
# [1] 1931-1940 1951-1960 1951-1960
# Levels: 1921-1930 1931-1940 1941-1950 1951-1960 1961-1970 1971-1980 1981-1990 1991-2000 2001-2010 2011-2020
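
If calendar decades such as 1930-1939 are preferred, one possible variation (not part of the approach above, just a sketch) is to pass right = FALSE to cut(), which makes the intervals left-closed, and to subtract 1 from the upper bound in each label. The names years_alt, labels_alt, and decade_alt are illustrative only.

```r
# Left-closed bins: [1930, 1940) covers 1930 through 1939
years_alt  <- seq(1920, 2030, by = 10)
labels_alt <- paste(years_alt[-length(years_alt)], years_alt[-1] - 1, sep = "-")

decade_alt <- cut(c(1935, 1959, 1960), years_alt,
                  labels = labels_alt, right = FALSE)
decade_alt
# [1] 1930-1939 1950-1959 1960-1969
```

With this variant 1960 falls into 1960-1969 rather than into the preceding bin, so the choice between the two depends on which year should start a decade.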