Build a balanced panel, keeping only the observations that are repeated every year

I want to keep only the units that are observed in every year. How can I do this?

I have the following example:

df <- structure(list(variable = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 
5, 5), Year = c(2010, 2011, 2012, 2010, 2012, 2010, 2011, 2012, 
2011, 2012, 2010, 2011, 2012)), class = "data.frame", row.names = c(NA, 
-13L))

I would like to get:

structure(list(variable = c(1, 1, 1, 3, 3, 3, 5, 5, 5), Year = c(2010, 
2011, 2012, 2010, 2011, 2012, 2010, 2011, 2012)), row.names = c(1L, 
2L, 3L, 6L, 7L, 8L, 11L, 12L, 13L), class = "data.frame")

This example is simple, but I need to do this on a very large dataset in order to build a balanced panel. Thanks for your help.

In base R, we can use subset with table:

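# All years that appear anywhere in the data
yr <- unique(df$Year)
# Keep the units whose row count equals the number of years
# (this assumes at most one row per unit and year)
subset(df, variable %in% names(which(table(variable) == length(yr))))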
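If the data might contain more than one row for the same unit and year, counting rows with table can misclassify a unit. A minimal base R sketch that counts distinct years per unit instead, using ave, could look like the following. The names n_years, years_per_unit and balanced are introduced here for illustration only and are not part of the original answer.

```r
# Rebuild the example data from the question
df <- data.frame(
  variable = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5),
  Year = c(2010, 2011, 2012, 2010, 2012, 2010, 2011, 2012,
           2011, 2012, 2010, 2011, 2012)
)

# Number of distinct years in the whole dataset
n_years <- length(unique(df$Year))

# For every row, the number of distinct years observed for that row's unit
years_per_unit <- ave(df$Year, df$variable,
                      FUN = function(y) length(unique(y)))

# Keep only the rows of units that are observed in every year
balanced <- df[years_per_unit == n_years, ]
balanced
```

Because the filter is a plain logical index, the original row names (1, 2, 3, 6, 7, 8, 11, 12, 13) are preserved, as in the desired output of the question.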

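Or with dplyr: group by 'variable', then filter the groups whose number of distinct 'Year' values (n_distinct) equals the number of distinct years in the whole dataset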

library(dplyr)
df %>%
    group_by(variable) %>% 
    filter(n_distinct(Year) == n_distinct(.$Year)) %>% 
    ungroup
# A tibble: 9 x 2
  variable  Year
     <dbl> <dbl>
1        1  2010
2        1  2011
3        1  2012
4        3  2010
5        3  2011
6        3  2012
7        5  2010
8        5  2011
9        5  2012
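On a large dataset it can be worth confirming that the result really is balanced. One possible check, sketched here in base R, is to cross-tabulate unit against year and verify that every cell equals 1. The helper is_balanced and its arguments id and time are hypothetical names chosen for this sketch.

```r
# Hypothetical helper: TRUE if every unit appears exactly once in every year
is_balanced <- function(data, id = "variable", time = "Year") {
  counts <- table(data[[id]], data[[time]])
  all(counts == 1)
}

# Example data from the question
df <- data.frame(
  variable = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5),
  Year = c(2010, 2011, 2012, 2010, 2012, 2010, 2011, 2012,
           2011, 2012, 2010, 2011, 2012)
)

# The balanced subset expected in the question (units 1, 3 and 5)
balanced <- df[df$variable %in% c(1, 3, 5), ]

is_balanced(df)        # units 2 and 4 are missing a year
is_balanced(balanced)  # every remaining unit has all three years
```

Note that table builds a dense unit-by-year matrix, so for a very large number of units a grouped count may be a lighter alternative.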