Use lapply on a function that takes a df and a column as inputs - reorder by percentage
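I have an education data frame with about 40 features that I want to cross-tabulate against a single column called 'Tutor' and generate percentages. I then want the rows of each table sorted in descending order by the "Yes" percentage for that feature.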
library(purrr)
library(tidyverse)
library(janitor)
toydat <- data.frame(ID = c(1:12), Learning_mode = rep(c("External","Internal"), times =6),
Subject = rep(c("Maths","English","Chemistry", "Physics"), each = 3 ),
Grade = runif(12, 0,100), Tutor = rep(c("Yes","No"), times = 6))
toydat %>%
select_if(~is.factor(.) |is.character(.)) %>%
select (-Tutor) %>%
imap(.f = ~janitor::tabyl(toydat, !!sym(.y), Tutor, sort = TRUE)) %>%
adorn_totals(c("row", "col")) %>%
adorn_percentages("row") %>%
adorn_pct_formatting(rounding = "half up", digits = 0) %>%
adorn_ns() %>%
adorn_title("combined")
I will eventually put this into an Rmarkdown report that shows how the features affect participation in tutoring. The output looks like this:
$Learning_mode
Learning_mode/Tutor No Yes Total
External 0% (0) 100% (6) 100% (6)
Internal 100% (6) 0% (0) 100% (6)
Total 50% (6) 50% (6) 100% (12)
$Subject
Subject/Tutor No Yes Total
Chemistry 33% (1) 67% (2) 100% (3)
English 67% (2) 33% (1) 100% (3)
Maths 33% (1) 67% (2) 100% (3)
Physics 67% (2) 33% (1) 100% (3)
Total 50% (6) 50% (6) 100% (12)
But I would like it to look like this, with each feature's rows in descending order of the "Yes" percentage column:
$Learning_mode
Learning_mode/Tutor No Yes Total
External 0% (0) 100% (6) 100% (6)
Internal 100% (6) 0% (0) 100% (6)
Total 50% (6) 50% (6) 100% (12)
$Subject
Subject/Tutor No Yes Total
Chemistry 33% (1) 67% (2) 100% (3)
Maths 33% (1) 67% (2) 100% (3)
English 67% (2) 33% (1) 100% (3)
Physics 67% (2) 33% (1) 100% (3)
Total 50% (6) 50% (6) 100% (12)
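I tried writing a function that relevels each character feature according to its percentage against Tutor. It works, but I can't apply it to my data frame with lapply. The releveling function: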
newlevels <- function(x) {
  # Row-wise proportions of Tutor within each level of column x (a column name)
  tab <- as.data.frame(prop.table(table(toydat[[x]], toydat$Tutor), 1))
  tab <- tab %>% filter(Var2 == "Yes")
  # Order the levels by decreasing "Yes" proportion
  tab <- tab[order(-tab$Freq), ]
  as.character(tab$Var1)
}
toydat$Subject <- factor(toydat$Subject, levels = newlevels("Subject"))
How can I apply the newlevels function to reorder all non-numeric columns in the data frame? At the moment it does not work:
cols <- c("Subject","Learning_mode")
toydat[cols] <-lapply(toydat[cols], function(x) {factor(x), levels = newlevels(x)})
I get the following error:
Error: unexpected ',' in "toydat[cols] <-lapply(toydat[cols], function(x) {factor(x),"
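Two separate things appear to go wrong in that call. The parse error comes from `{factor(x), levels = newlevels(x)}`: the parenthesis closes right after `x`, so `levels = ...` ends up outside the call to `factor()`. Independently of that, `lapply(toydat[cols], ...)` passes each column's values, whereas `newlevels()` expects a column name. A minimal sketch of one possible fix is to iterate over the names instead. The `newlevels()` here is a base-R restatement of the function above, and the seed is an assumption added only for reproducibility:

```r
# Toy data from the question; set.seed() is an addition for reproducibility.
set.seed(1)
toydat <- data.frame(
  ID = 1:12,
  Learning_mode = rep(c("External", "Internal"), times = 6),
  Subject = rep(c("Maths", "English", "Chemistry", "Physics"), each = 3),
  Grade = runif(12, 0, 100),
  Tutor = rep(c("Yes", "No"), times = 6)
)

# Base-R restatement of newlevels(): takes a column *name* and returns that
# column's levels ordered by decreasing row-wise share of Tutor == "Yes".
newlevels <- function(x) {
  tab <- prop.table(table(toydat[[x]], toydat$Tutor), 1)
  p_yes <- tab[, "Yes"]
  # order() leaves ties in their original (alphabetical) order
  names(p_yes)[order(-p_yes)]
}

cols <- c("Subject", "Learning_mode")

# Iterate over the column names so newlevels() receives a name, and keep
# `levels =` inside the call to factor().
toydat[cols] <- lapply(cols, function(nm) {
  factor(toydat[[nm]], levels = newlevels(nm))
})

levels(toydat$Subject)
levels(toydat$Learning_mode)
```

Because `tabyl()` follows factor level order, releveling the columns this way before tabulating should give the desired row order without any further sorting step.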
It is probably easiest to use arrange() in the pipeline:
toydat %>%
select_if(~is.factor(.) |is.character(.)) %>%
select (-Tutor) %>%
imap(.f = ~janitor::tabyl(toydat, !!sym(.y), Tutor, sort = TRUE)) %>%
map(~arrange(.x, desc(Yes))) %>%
adorn_totals(c("row", "col")) %>%
adorn_percentages("row") %>%
adorn_pct_formatting(rounding = "half up", digits = 0) %>%
adorn_ns() %>%
adorn_title("combined")
$Learning_mode
Learning_mode/Tutor No Yes Total
External 0% (0) 100% (6) 100% (6)
Internal 100% (6) 0% (0) 100% (6)
Total 50% (6) 50% (6) 100% (12)
$Subject
Subject/Tutor No Yes Total
Chemistry 33% (1) 67% (2) 100% (3)
Maths 33% (1) 67% (2) 100% (3)
English 67% (2) 33% (1) 100% (3)
Physics 67% (2) 33% (1) 100% (3)
Total 50% (6) 50% (6) 100% (12)
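One caveat: `arrange(.x, desc(Yes))` sorts on the raw "Yes" count. That coincides with the percentage ordering here only because every level has the same row total. The sketch below illustrates the difference on a small, hypothetical unbalanced data frame (`unbalanced` is made up for this example and is not part of the original data). It assumes the two Tutor columns are named `No` and `Yes`, as in the output above:

```r
library(dplyr)
library(janitor)

# Hypothetical unbalanced data: "Maths" has more "Yes" rows in absolute terms,
# but "Physics" has the higher "Yes" share.
unbalanced <- data.frame(
  Subject = c(rep("Maths", 10), rep("Physics", 2)),
  Tutor   = c(rep("Yes", 3), rep("No", 7), "Yes", "Yes")
)

counts <- tabyl(unbalanced, Subject, Tutor)

# Sorting on the count puts Maths (3 "Yes") ahead of Physics (2 "Yes").
by_count <- arrange(counts, desc(Yes))

# Sorting on the row share puts Physics (2 of 2) ahead of Maths (3 of 10).
by_pct <- arrange(counts, desc(Yes / (Yes + No)))

by_count$Subject
by_pct$Subject
```

In the pipeline above, this would amount to replacing `desc(Yes)` with `desc(Yes / (Yes + No))` inside the `map()` step. It is worth checking the counts shown by `adorn_ns()` against the raw data after any reordering.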