R: summary by group and within groups

Suppose this is my dataset, sorted by Subjects and Test:

ID    Subjects   Test   Score     Results
1     English    1      78        Pass
2     English    1      98        Pass    

2     English    2      81        Pass
3     English    2      81        Pass

2     English    3      15        Fail 
3     English    3      74        Pass

4     Physics    1      34        Fail
2     Physics    1      79        Pass

4     Physics    2      74        Fail
3     Physics    2      81        Pass   
3     Physics    2      81        Pass

4     Physics    3      48        Fail    
2     Physics    3      15        Fail
3     Physics    3      74        Pass     

I would like to create a summary like this:

           Test1                    Test2                    Test3
Subject    FailAverage   %Fail      FailAverage   %Fail      FailAverage   %Fail
English    0             0          0             0          15            50
Physics    34            50         74            33         31.5          66

Any help is greatly appreciated, thanks.

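I tried to stick to tidyverse principles. To get the exact layout you may need a table package (e.g. gt), but the code below gets you close.

I summarized the data into a new data frame, then used pivot_wider to turn the rows into columns, and finally did some minor tidying.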

library(tidyverse)

# recreate the table
df <- tribble(
~ID,    ~Subjects,   ~Test,   ~Score,     ~Results,
1,     "English",    1,      78,        "Pass",
2,     "English",    1,      98,        "Pass",    
2,     "English",    2,      81,        "Pass",
3,     "English",    2,      81,        "Pass",
2,     "English",    3,      15,        "Fail", 
3,     "English",    3,      74,        "Pass",
4,     "Physics",    1,      34,        "Fail",
2,     "Physics",    1,      79,        "Pass",
4,     "Physics",    2,      74,        "Fail",
3,     "Physics",    2,      81,        "Pass",   
3,     "Physics",    2,      81,        "Pass",
4,     "Physics",    3,      48,        "Fail",    
2,     "Physics",    3,      15,        "Fail",
3,     "Physics",    3,      74,        "Pass") 

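# summarize the data for each Subjects/Test group
# FailAverage: mean score of the failing rows (NaN when a group has no failures)
# Failper: proportion of failing rows, between 0 and 1
df_fail <- df %>% 
  group_by(Subjects, Test) %>% 
  summarize(FailAverage = mean(Score[Results == "Fail"]),
            Failper = mean(Results == "Fail", na.rm = TRUE),
            .groups = "drop")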


# pivot the values wider, put the columns in test order, then rename them
# (e.g. FailAverage_1 becomes FailAverage_test1)
df_fail %>% pivot_wider(names_from = Test,
                        values_from = c(FailAverage, Failper)) %>%
  relocate(Subjects, contains("1"), contains("2"), contains("3")) %>%
  rename_with(.cols = -Subjects, .fn = ~gsub("_", "_test", .x))
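
If you want the numbers to match the table in the question more closely (0 rather than NaN where nobody failed, and %Fail as a percentage), here is one possible variant. It is only a sketch: the names `summary_wide` and `PctFail` are my own, and it assumes tidyr 1.2.0 or later for the `names_vary` argument.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same data as in the question
df <- tribble(
  ~ID, ~Subjects, ~Test, ~Score, ~Results,
  1, "English", 1, 78, "Pass",
  2, "English", 1, 98, "Pass",
  2, "English", 2, 81, "Pass",
  3, "English", 2, 81, "Pass",
  2, "English", 3, 15, "Fail",
  3, "English", 3, 74, "Pass",
  4, "Physics", 1, 34, "Fail",
  2, "Physics", 1, 79, "Pass",
  4, "Physics", 2, 74, "Fail",
  3, "Physics", 2, 81, "Pass",
  3, "Physics", 2, 81, "Pass",
  4, "Physics", 3, 48, "Fail",
  2, "Physics", 3, 15, "Fail",
  3, "Physics", 3, 74, "Pass"
)

summary_wide <- df %>%
  group_by(Subjects, Test) %>%
  summarize(
    # mean score of the failing rows; 0 when the group has no failures
    FailAverage = if (any(Results == "Fail")) mean(Score[Results == "Fail"]) else 0,
    # share of failing rows, expressed as a percentage
    PctFail = 100 * mean(Results == "Fail"),
    .groups = "drop"
  ) %>%
  pivot_wider(
    names_from = Test,
    values_from = c(FailAverage, PctFail),
    # build names such as Test1_FailAverage, Test1_PctFail
    names_glue = "Test{Test}_{.value}",
    # keep the two columns of each test next to each other
    names_vary = "slowest"
  )

summary_wide
```

The percentages are left unrounded; wrap them in `round()` if you prefer whole numbers. Using `names_vary = "slowest"` replaces the `relocate()` step, since the columns already come out grouped by test.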