How can I pull a group-based vector to pass to a function within dplyr's summarize or mutate?

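I am trying to use the AUC function from the psych package to create a summary table of accuracy, sensitivity, and specificity. I want to define the input vector (t, a 4 x 1 vector) for each level of a grouping variable.

What I have tried seems to ignore the grouping.

Example: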

library(tidyverse)
library(psych)

Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

Data %>% 
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
         Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
         Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)

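This gets me close to the correct output, except that the accuracy, sensitivity, and specificity values are calculated only for the first row and then repeated: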

# A tibble: 4 x 8
# Groups:   Class [4]
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569    0.947       0.995       0.931
2 B       185     1    55   570    0.947       0.995       0.931
3 C       221     6    19   564    0.947       0.995       0.931
4 D       192     1    48   569    0.947       0.995       0.931

I also tried summarize:

Data %>% 
  group_by(Class) %>%
  summarize(Accuracy = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Accuracy,
            Sensitivity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Sensitivity,
            Specificity = AUC(t = unlist(.[1,2:5], use.names=FALSE))$Specificity)

But the output is the same as above.

The desired output is a unique calculation for each level of "Class":

# A tibble: 4 x 8
  Class    TP    FP    FN    TN Accuracy Sensitivity Specificity
  <fct> <dbl> <dbl> <dbl> <dbl>    <dbl>       <dbl>       <dbl>
1 A       198     1    42   569     0.95        0.99        0.93
2 B       185     1    55   570     0.93        0.99        0.91
3 C       221     6    19   564     0.97        0.97        0.97
4 D       192     1    48   569     0.94        0.99        0.92

How can I get the function calls within summarize or mutate to respect the groups?

This works:

Data %>% 
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Accuracy,
         Sensitivity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Sensitivity,
         Specificity = AUC(t = unlist(.[Class,2:5], use.names=FALSE))$Specificity)

But perhaps this is clearer:

Data %>% 
  group_by(Class) %>%
  mutate(Accuracy = AUC(t = c(TP, FP, FN, TN))$Accuracy,
         Sensitivity = AUC(t = c(TP, FP, FN, TN))$Sensitivity,
         Specificity = AUC(t = c(TP, FP, FN, TN))$Specificity)
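
The reason the original attempts repeat one row's values is most likely that `.` in a magrittr pipeline stands for the whole data frame that was piped in, not for the current group, whereas bare column names inside mutate are evaluated group by group. Here is a minimal sketch of that difference, using only dplyr and the data from the question. The column names rows_in_dot, rows_in_group, first_TP_dot and first_TP_group are made up for illustration.

```r
library(dplyr)

Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

check <- Data %>%
  group_by(Class) %>%
  mutate(rows_in_dot    = nrow(.),   # `.` is the full piped data frame
         rows_in_group  = n(),       # n() is evaluated within each group
         first_TP_dot   = .$TP[1],   # first row of the full data, every time
         first_TP_group = TP[1])     # first row of the current group

print(check)
```

So .[1,2:5] always picks the first row of the complete table, while c(TP, FP, FN, TN) picks up the values of the group being processed.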

To avoid calling AUC several times for each class, I would write a wrapper like this:

# Load libraries
library(tidyverse)
library(psych)

# Create data frame
Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

# Wrapper function
AUC_wrapper <- function(Class, TP, FP, FN, TN){
  res <- AUC(t = c(TP, FP, FN, TN))
  data.frame(Class = Class, 
             TP = TP,
             FP = FP,
             FN = FN,
             TN = TN,
             Accuracy = res$Accuracy, 
             Sensitivity = res$Sensitivity, 
             Specificity = res$Specificity)
}

# Run using purrr
pmap_dfr(Data, AUC_wrapper)

#   Class  TP FP FN  TN  Accuracy Sensitivity Specificity
# 1     A 198  1 42 569 0.9469136   0.9949749   0.9312602
# 2     B 185  1 55 570 0.9309494   0.9946237   0.9120000
# 3     C 221  6 19 564 0.9691358   0.9735683   0.9674099
# 4     D 192  1 48 569 0.9395062   0.9948187   0.9222042
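
If you would rather stay inside a dplyr pipeline, the same "one AUC call per class" idea can probably also be written with group_modify. This is only a sketch: it assumes one row per Class (as in the example data) and a dplyr version that provides group_modify. The name out is just for illustration.

```r
library(dplyr)
library(psych)

Data <- data.frame(Class = c("A","B","C","D"),
                   TP = c(198,185,221,192),
                   FP = c(1,1,6,1),
                   FN = c(42,55,19,48),
                   TN = c(569,570,564,569))

out <- Data %>%
  group_by(Class) %>%
  group_modify(~ {
    # .x holds the rows of the current group, without the Class column.
    # Assumption: exactly one row per group, so this vector has length 4.
    res <- AUC(t = c(.x$TP, .x$FP, .x$FN, .x$TN))
    mutate(.x,
           Accuracy    = res$Accuracy,
           Sensitivity = res$Sensitivity,
           Specificity = res$Specificity)
  }) %>%
  ungroup()

print(out)
```

AUC is called once per group here, and the result is a tibble with the original columns plus the three new ones.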