Performing a rolling average with criteria in R

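I have been trying to learn the most basic pieces first and then build up complexity. For this one, how would I modify the last line so that it creates a 12-month rolling average for each series code? In this case it should produce an average of about 8 for series code 100 and about 27 for series code 110.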

First, the sample data:

Monthx<- c(201911,201912,20201
         ,20202,20203,20204,20205,20206,20207
         ,20208,20209,202010,202011,201911,201912,20201
         ,20202,20203,20204,20205,20206,20207
         ,20208,20209,202010,202011)

empx <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,21,22,23,24,25,26,27,28,29,20,31,32,33)

seriescode<-c(100,100,100,100,100,100,100,100,100,100,100,100,100,110,110,110,110,110,110,110,110,110,110,110,110,110)

ces12x <- data.frame(Monthx,empx,seriescode)

Manipulation:

library(dplyr)

ces12x<- ces12x %>% mutate(year = substr(as.numeric(Monthx),1,4),
                           month = substr(as.numeric(Monthx),5,7),
                           date = as.Date(paste(year,month,"1",sep ="-")))
Month_ord <- order(Monthx)

ces12x<-ces12x %>% mutate(ravg = zoo::rollmeanr(empx, 12, fill = NA))

If you want to stay within the tidyverse for this, you can do the following:

library(dplyr)

ces12x %>%
  group_by(seriescode) %>%
  arrange(date) %>%
  slice(tail(row_number(), 12)) %>%
  summarize(ravg = mean(empx))
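
As a quick cross-check of those numbers, the same "mean of the last 12 months per series" can be sketched in base R without any packages. This assumes the rows are already in chronological order within each series, as they are in the sample data. The name `last12` and the compact `rep()` / `:` construction of the vectors are only for illustration and are not part of the original code.

```r
# Sample values from the question, rebuilt compactly so the block runs on its own.
empx <- c(1:13, 21:29, 20, 31:33)
seriescode <- rep(c(100, 110), each = 13)

# Split empx by series, keep the last 12 observations of each, and average them.
# Assumption: rows are already sorted by month within each series.
last12 <- sapply(split(empx, seriescode), function(x) mean(tail(x, 12)))
print(last12)
```

This should give 7.5 for series 100 and roughly 26.67 for series 110, which round to the 8 and 27 mentioned in the question.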

You only need to add a group_by(seriescode); mutate will then be applied separately within each series code:

library(dplyr)

Monthx<- c(201911,201912,20201
           ,20202,20203,20204,20205,20206,20207
           ,20208,20209,202010,202011,201911,201912,20201
           ,20202,20203,20204,20205,20206,20207
           ,20208,20209,202010,202011)

empx <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,21,22,23,24,25,26,27,28,29,20,31,32,33)

seriescode<-c(100,100,100,100,100,100,100,100,100,100,100,100,100,110,110,110,110,110,110,110,110,110,110,110,110,110)

ces12x <- data.frame(Monthx,empx,seriescode)
ces12x<- ces12x %>% mutate(year = substr(as.numeric(Monthx),1,4),
                           month = substr(as.numeric(Monthx),5,7),
                           date = as.Date(paste(year,month,"1",sep ="-")))
Month_ord <- order(Monthx)

ces12x<-ces12x %>% group_by(seriescode) %>% mutate(ravg = zoo::rollmeanr(empx, 12, fill = NA)) # add the group_by(seriescode)

This produces the output:

# A tibble: 26 x 7
# Groups:   seriescode [2]
   Monthx  empx seriescode year  month date        ravg
    <dbl> <dbl>      <dbl> <chr> <chr> <date>     <dbl>
 1 201911     1        100 2019  11    2019-11-01  NA  
 2 201912     2        100 2019  12    2019-12-01  NA  
 3  20201     3        100 2020  1     2020-01-01  NA  
 4  20202     4        100 2020  2     2020-02-01  NA  
 5  20203     5        100 2020  3     2020-03-01  NA  
 6  20204     6        100 2020  4     2020-04-01  NA  
 7  20205     7        100 2020  5     2020-05-01  NA  
 8  20206     8        100 2020  6     2020-06-01  NA  
 9  20207     9        100 2020  7     2020-07-01  NA  
10  20208    10        100 2020  8     2020-08-01  NA  
11  20209    11        100 2020  9     2020-09-01  NA  
12 202010    12        100 2020  10    2020-10-01   6.5
13 202011    13        100 2020  11    2020-11-01   7.5
14 201911    21        110 2019  11    2019-11-01  NA  
15 201912    22        110 2019  12    2019-12-01  NA  
16  20201    23        110 2020  1     2020-01-01  NA  
17  20202    24        110 2020  2     2020-02-01  NA  
18  20203    25        110 2020  3     2020-03-01  NA  
19  20204    26        110 2020  4     2020-04-01  NA  
20  20205    27        110 2020  5     2020-05-01  NA  
21  20206    28        110 2020  6     2020-06-01  NA  
22  20207    29        110 2020  7     2020-07-01  NA  
23  20208    20        110 2020  8     2020-08-01  NA  
24  20209    31        110 2020  9     2020-09-01  NA  
25 202010    32        110 2020  10    2020-10-01  25.7
26 202011    33        110 2020  11    2020-11-01  26.7
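
One thing to keep in mind is that the rolling mean follows row order, so the result is only a true 12-month average if the rows are chronological within each series. Below is a minimal sketch that sorts explicitly before rolling and then keeps only the most recent month per series. It assumes dplyr 1.0.0 or later (for `slice_tail()`) and an installed zoo package. The `date` column is built directly with `seq()` rather than parsed from `Monthx`, and the name `latest` is hypothetical.

```r
library(dplyr)

# Sample values from the question; dates are generated directly for brevity.
empx <- c(1:13, 21:29, 20, 31:33)
seriescode <- rep(c(100, 110), each = 13)
date <- rep(seq(as.Date("2019-11-01"), by = "month", length.out = 13), times = 2)
ces12x <- data.frame(date, empx, seriescode)

latest <- ces12x %>%
  arrange(seriescode, date) %>%    # make row order chronological within each series
  group_by(seriescode) %>%
  mutate(ravg = zoo::rollmeanr(empx, 12, fill = NA)) %>%  # trailing 12-month mean
  slice_tail(n = 1) %>%            # keep only the latest month of each series
  ungroup()

print(latest)
```

The last row of each series then carries the same values as rows 13 and 26 in the output above (7.5 and about 26.7).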