按组查找滚动周期内的计数或频率
Find the count or frequency over rolling periods by group
这是我的数据:
Person<-c("Bob","Bob","Bob","Bob","Mary","Mary","Mary","Mary","Sue","Sue","Sue","Sue")
Year<-c(1980,1981,1982,1983,1980,1981,1982,1983,1980,1981,1982,1983)
Attendance<-c(1,1,0,0,1,1,1,0,0,0,1,0)
data<-as.data.frame(cbind(Person,Year,Attendance))
data
Person Year Attendance
Bob 1980 1
Bob 1981 1
Bob 1982 0
Bob 1983 0
Mary 1980 1
Mary 1981 1
Mary 1982 1
Mary 1983 0
Sue 1980 0
Sue 1981 0
Sue 1982 1
Sue 1983 0
我想使用 tidyverse/dplyr 创建一个 table 来显示每个人在连续两年期间的出勤率。像这样:
Person Two_year_period Frequency_attendance
Bob 1980-1981 100%
Bob 1981-1982 50%
Bob 1982-1983 0%
Mary 1980-1981 100%
Mary 1981-1982 100%
Mary 1982-1983 50%
Sue 1980-1981 0%
Sue 1981-1982 50%
Sue 1982-1983 50%
这是一个使用 dplyr 的解决方案:
data_final <- data %>%
group_by(Person) %>%
mutate(Two_year_period = paste(Year,lead(as.numeric(Year)), sep = "-")) %>% # Create new column with lead year name
mutate(Attendance2 = lead(as.numeric(Attendance))) %>% # Create new column with lead attendance value
drop_na(Attendance2) %>% # Drop NA
mutate(Frequency_attendance = paste((as.double(Attendance) + as.double(Attendance2))*(100/2), "%")) %>% # Calculate Frequency_attendance
select(-one_of(c('Year', 'Attendance', 'Attendance2'))) #Drop auxiliary columns
data_final
Person Two_year_period Frequency_attendance
<chr> <chr> <chr>
1 Bob 1980-1981 100 %
2 Bob 1981-1982 50 %
3 Bob 1982-1983 0 %
4 Mary 1980-1981 100 %
5 Mary 1981-1982 100 %
6 Mary 1982-1983 50 %
7 Sue 1980-1981 0 %
8 Sue 1981-1982 50 %
9 Sue 1982-1983 50 %
这是我的数据:
Person<-c("Bob","Bob","Bob","Bob","Mary","Mary","Mary","Mary","Sue","Sue","Sue","Sue")
Year<-c(1980,1981,1982,1983,1980,1981,1982,1983,1980,1981,1982,1983)
Attendance<-c(1,1,0,0,1,1,1,0,0,0,1,0)
data<-as.data.frame(cbind(Person,Year,Attendance))
data
Person Year Attendance
Bob 1980 1
Bob 1981 1
Bob 1982 0
Bob 1983 0
Mary 1980 1
Mary 1981 1
Mary 1982 1
Mary 1983 0
Sue 1980 0
Sue 1981 0
Sue 1982 1
Sue 1983 0
我想使用 tidyverse/dplyr 创建一个 table 来显示每个人在连续两年期间的出勤率。像这样:
Person Two_year_period Frequency_attendance
Bob 1980-1981 100%
Bob 1981-1982 50%
Bob 1982-1983 0%
Mary 1980-1981 100%
Mary 1981-1982 100%
Mary 1982-1983 50%
Sue 1980-1981 0%
Sue 1981-1982 50%
Sue 1982-1983 50%
这是一个使用 dplyr 的解决方案:
data_final <- data %>%
group_by(Person) %>%
mutate(Two_year_period = paste(Year,lead(as.numeric(Year)), sep = "-")) %>% # Create new column with lead year name
mutate(Attendance2 = lead(as.numeric(Attendance))) %>% # Create new column with lead attendance value
drop_na(Attendance2) %>% # Drop NA
mutate(Frequency_attendance = paste((as.double(Attendance) + as.double(Attendance2))*(100/2), "%")) %>% # Calculate Frequency_attendance
select(-one_of(c('Year', 'Attendance', 'Attendance2'))) #Drop auxiliary columns
data_final
Person Two_year_period Frequency_attendance
<chr> <chr> <chr>
1 Bob 1980-1981 100 %
2 Bob 1981-1982 50 %
3 Bob 1982-1983 0 %
4 Mary 1980-1981 100 %
5 Mary 1981-1982 100 %
6 Mary 1982-1983 50 %
7 Sue 1980-1981 0 %
8 Sue 1981-1982 50 %
9 Sue 1982-1983 50 %