How to group by multiple columns in R?

I need to group my data by three columns: gender, year, and employment status.

Here is my data:

ID <- c(1000, 1000, 1000, 1001, 1001, 1001, 1001, 1001, 1002, 1002, 1002, 1002, 1002)
Gender <- as.factor(c("M","M","M","M","M","M","M","M","F","F","F","F","F"))
Employment_status <- as.factor(c("Other","Other","Other","Employed","Employed","Employed","Employed","Employed","Employed","Employed","Employed","Employed","Unemployed"))
Year <- c(2016, 2017, 2018, 2016, 2017, 2018, 2019, 2020, 2016, 2017, 2018, 2019, 2020)

my_data <- data.frame(ID, Gender, Employment_status, Year, stringsAsFactors = FALSE)

I would like my final result to be a table of employment rates by gender and year. How can I achieve this in R?

The expected output looks like this:

Thanks!

In base R you can do this:

ftable(prop.table(table(my_data[-1]), c(1, 3)), col.vars = c("Gender", "Employment_status"))


     Gender                   F                         M                 
     Employment_status Employed Other Unemployed Employed Other Unemployed
Year                                                                      
2016                        1.0   0.0        0.0      0.5   0.5        0.0
2017                        1.0   0.0        0.0      0.5   0.5        0.0
2018                        1.0   0.0        0.0      0.5   0.5        0.0
2019                        1.0   0.0        0.0      1.0   0.0        0.0
2020                        0.0   0.0        1.0      1.0   0.0        0.0
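
If only the employed share is needed, the table above can be reduced to one number per Year and Gender. This is a minimal base R sketch, assuming "employment rate" means the proportion of rows whose Employment_status is "Employed" within each group. The question's expected output is not shown, so that definition is a guess, and the name `rate` is only illustrative.

```r
# Rebuild the data from the question so the snippet runs on its own.
my_data <- data.frame(
  ID = c(1000, 1000, 1000, 1001, 1001, 1001, 1001, 1001,
         1002, 1002, 1002, 1002, 1002),
  Gender = factor(c(rep("M", 8), rep("F", 5))),
  Employment_status = factor(c(rep("Other", 3), rep("Employed", 9),
                               "Unemployed")),
  Year = c(2016:2018, 2016:2020, 2016:2020)
)

# Assumption: rate = share of rows with status "Employed".
# tapply() takes the mean of the logical vector in every Year x Gender cell,
# returning a matrix with years as rows and genders as columns.
rate <- with(
  my_data,
  tapply(Employment_status == "Employed",
         list(Year = Year, Gender = Gender),
         mean)
)

print(rate)
```

The M column for 2016 should agree with the 0.5 shown for M / Employed in the ftable output above.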

Is this roughly what you are after? It counts the rows in each Gender and Year group:

library(dplyr)


my_data %>% 
  group_by(Gender, Year) %>% 
  count(Employment_status) %>% 
  summarise(sum(n)) %>% 
  arrange(Year)

Output:

   Gender  Year `sum(n)`
   <fct>  <dbl>    <int>
 1 F       2016        1
 2 M       2016        2
 3 F       2017        1
 4 M       2017        2
 5 F       2018        1
 6 M       2018        2
 7 F       2019        1
 8 M       2019        1
 9 F       2020        1
10 M       2020        1
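
The counts above are group sizes rather than rates. A possible dplyr variant that returns a rate is sketched below, again assuming "employment rate" means the share of rows with status "Employed" in each Gender and Year group. The column name `Employment_rate` and the object name `employment_rate` are illustrative, and the `.groups` argument needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Rebuild the data from the question so the snippet runs on its own.
my_data <- data.frame(
  ID = c(1000, 1000, 1000, 1001, 1001, 1001, 1001, 1001,
         1002, 1002, 1002, 1002, 1002),
  Gender = factor(c(rep("M", 8), rep("F", 5))),
  Employment_status = factor(c(rep("Other", 3), rep("Employed", 9),
                               "Unemployed")),
  Year = c(2016:2018, 2016:2020, 2016:2020)
)

employment_rate <- my_data %>%
  group_by(Gender, Year) %>%
  # Assumption: rate = proportion of rows marked "Employed" in the group.
  summarise(
    Employment_rate = mean(Employment_status == "Employed"),
    .groups = "drop"
  ) %>%
  arrange(Year, Gender)

print(employment_rate)
```

Taking the mean of a logical vector gives the proportion of TRUE values, so no separate count and division step is needed.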