R 中每个组的总和和条件主元

Question

您好，我正在尝试在 R 中按组进行数据透视总和。在我的数据框下方 (my_df)

my_df <- data.frame(Municipality=c('a', 'a', 'a', 'a', 'a', 'a', 'a','a','b','b'),
                 section=c(1, 1, 1, 1, 1, 1, 2,2,1,1),
                 state=c('ac', 'ac', 'ac', 'ac', 'ac', 'ac', 'ac','ac','mg','mg'),
                 gender=c('male', 'male', 'female', 'male', 'female', 'female', 
                          'male','female','female','female'),
                 age=c('60-64', '17', '18', '18', '21-24', '30-34', '19','40-44','60-64','50-54'),
                 age_code=c('6064', '1700', '1800', '1800', '2124', '3034', 
                            '1900','4044','6064','5054'),
                 schooling=c('read_write', 'high school', 'elementary', 'elementary', 'illiterate', 'college', 'elementary incomplete','high school incomplete','illiterate', 'elementary'),
                 schooling_code=c(2, 4, 3, 3, 1, 5, 3.5, 4.5, 1, 2),
                 num_voters=c(1, 4, 2, 3, 9, 10, 1, 8, 11, 3))

df_desired <- data.frame(Municipality=c('a', 'a','b'),
                 section=c(1,2,1),
                 state=c('ac', 'ac','mg'),
                 tot_elect=c(27, 9, 14),
                 tot_male=c(8, 1, 0),
                 share_male=c(0.29,0.11,0),
                 share_illiterate=c(0.33, 0, 0.78),
                 share_60_64=c(0.03, 0, 0.78))

df_desired 是我想要得到的。非常感谢您的帮助。

到目前为止，我使用了

aggregate(df$num_voters, by=list(df$Municipality,df$section,df$state), FUN=sum)

Answer 1

library(dplyr)
my_df %>%
  group_by(Municipality, section, state) %>%
  summarize(tot_elect = sum(num_voters),
            tot_male = sum(num_voters[gender == "male"]),
            share_male = tot_male / tot_elect,
            share_illit = sum(num_voters[schooling == "illiterate"] / tot_elect),
            share_60_64 = sum(num_voters[age == "60-64"] / tot_elect))

R 中每个组的总和和条件主元

Sum by and conditional pivot each group in R

group-by

r

sum