How can I pull a group-based vector to pass to a function within dplyr's summarize or mutate?
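I'm trying to use the AUC function from the psych package to create a summary table of accuracy, sensitivity, and specificity. I would like to define the input vector (t, a 4 x 1 vector) for each level of a grouping variable.
What I've tried seems to ignore the grouping.
Example: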
library(tidyverse)
library(psych)
Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))
Data %>%
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
         Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
         Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)
This gets me close to the correct output, except that the Accuracy, Sensitivity, and Specificity values are computed from the first row only and then repeated:
# A tibble: 4 x 8
# Groups:   Class [4]
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569    0.947       0.995       0.931
2 B       185     1    55   570    0.947       0.995       0.931
3 C       221     6    19   564    0.947       0.995       0.931
4 D       192     1    48   569    0.947       0.995       0.931
I also tried summarize:
Data %>%
  group_by(Class) %>%
  summarize(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
            Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
            Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)
But the computed values are the same as above.
The desired output is a separate calculation for each level of "Class":
# A tibble: 4 x 8
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569     0.95        0.99        0.93
2 B       185     1    55   570     0.93        0.99        0.91
3 C       221     6    19   564     0.97        0.97        0.97
4 D       192     1    48   569     0.94        0.99        0.92
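How can I get a function call within summarize or mutate to respect the groups?
This works (note that it relies on Class being a factor, whose integer codes pick out the matching row):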
Data %>%
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Accuracy,
         Sensitivity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Sensitivity,
         Specificity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Specificity)
But perhaps this is clearer:
Data %>%
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = c(TP, FP, FN, TN))$Accuracy,
         Sensitivity = AUC(t = c(TP, FP, FN, TN))$Sensitivity,
         Specificity = AUC(t = c(TP, FP, FN, TN))$Specificity)
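A likely reason the original attempt ignored the groups: inside a magrittr pipe, `.` stands for the whole data frame on the left-hand side, not the current group's slice, so `.[1, 2:5]` always reads row 1. Bare column names such as TP, on the other hand, are evaluated group by group. A minimal sketch of the difference, using a made-up two-row data frame (`toy`, `demo`, and the new column names are illustrative, not part of the question):

```r
library(dplyr)

# Hypothetical toy data: two groups with one row each
toy <- data.frame(g = c("a", "b"), x = c(10, 20))

demo <- toy %>%
  group_by(g) %>%
  mutate(rows_in_dot   = nrow(.),  # `.` is the full piped-in data frame
         rows_in_group = n(),      # n() counts rows in the current group
         x_in_group    = sum(x))   # bare column names see only the group

demo
```

Here rows_in_dot reports the size of the full data frame for every group, while rows_in_group and x_in_group reflect each group on its own.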
To avoid calling AUC several times for each class, I would write a wrapper, like this:
# Load libraries
library(tidyverse)
library(psych)
# Create data frame
Data <- data.frame(Class = c("A","B","C","D"),
TP = c(198,185,221,192),
FP = c(1,1,6,1),
FN = c(42,55,19,48),
TN = c(569,570,564,569))
# Wrapper function
AUC_wrapper <- function(Class, TP, FP, FN, TN){
  res <- AUC(t = c(TP, FP, FN, TN))
  data.frame(Class = Class,
             TP = TP,
             FP = FP,
             FN = FN,
             TN = TN,
             Accuracy = res$Accuracy,
             Sensitivity = res$Sensitivity,
             Specificity = res$Specificity)
}
# Run using purrr
pmap_dfr(Data, AUC_wrapper)
#   Class  TP FP FN  TN  Accuracy Sensitivity Specificity
# 1     A 198  1 42 569 0.9469136   0.9949749   0.9312602
# 2     B 185  1 55 570 0.9309494   0.9946237   0.9120000
# 3     C 221  6 19 564 0.9691358   0.9735683   0.9674099
# 4     D 192  1 48 569 0.9395062   0.9948187   0.9222042
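If you would rather stay inside a dplyr pipeline, a similar one-call-per-class result can probably be had with group_modify(), which hands each group's rows to a function. This is only a sketch, assuming a dplyr version that provides group_modify() and that each Class has exactly one row; `add_auc` and `result` are names made up here:

```r
library(dplyr)
library(psych)

Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

# Hypothetical helper: receives the rows of one group (.x) and its key (.y)
add_auc <- function(.x, .y) {
  res <- AUC(t = c(.x$TP, .x$FP, .x$FN, .x$TN))  # AUC is called once per group
  mutate(.x,
         Accuracy = res$Accuracy,
         Sensitivity = res$Sensitivity,
         Specificity = res$Specificity)
}

result <- Data %>%
  group_by(Class) %>%
  group_modify(add_auc) %>%
  ungroup()

result
```

The grouping column is added back by group_modify(), so the helper only needs to return the remaining columns plus the three new ones.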