Using summarise, across, and quantile functions together

I am trying to compute summary statistics for the mtcars data set. Here is my code:

df <- as_tibble(mtcars)


df.sum2 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>% 
  mutate(across(where(is.factor), as.numeric)) %>% 
  summarise(across(
    .cols = everything(), 
    .fns = list(
                Min = min, 
                Q25 = quantile (., 0.25), 
                Median = median, 
                Q75 = quantile (., 0.75), 
                Max = max,
                Mean = mean, 
                StdDev = sd,
                N = n()
                ), na.rm = T,
   .names = "{col}_{fn}"
                   )
            )

But I get the following error:

Error: Problem with summarise() input ..1.
x Can't subset columns that don't exist.
x Locations 65, 66, 69, 71, 76, etc. don't exist.
i There are only 6 columns.
i Input ..1 is across(...).

If I remove Q25 = quantile (., 0.25) and Q75 = quantile (., 0.75) from the code above, it works fine. In fact, I can get the expected result with the following code:

df.sum <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>% # select variables to summarise
  summarise_each(funs(Min = min, 
                      Q25 = quantile (., 0.25), 
                      Median = median, 
                      Q75 = quantile (., 0.75), 
                      Max = max,
                      Mean = mean, 
                      StdDev = sd,
                      N = n()))

But I want to use across() together with summarise(). I don't want to use summarise_each().

When you pass additional arguments to a function, you need to use an anonymous function or the formula syntax. Try:

library(dplyr)

df <- as_tibble(mtcars)

df.sum2 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>% 
  mutate(across(where(is.factor), as.numeric)) %>% 
  summarise(across(
    .cols = everything(), 
    .fns = list(
      Min = min, 
      Q25 = ~quantile(., 0.25),   # formula syntax: `.` is the current column
      Median = median, 
      Q75 = ~quantile(., 0.75), 
      Max = max,
      Mean = mean, 
      StdDev = sd,
      N = ~n()                    # n() takes no arguments, so wrap it as well
    ),
    .names = "{col}_{fn}"
  ))
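
As far as I can tell, the original call fails because `quantile(., 0.25)` inside the list is not a function but a call that R evaluates right away, and within a `%>%` chain the `.` there refers to the whole piped tibble rather than to a single column. Writing `~quantile(., 0.25)` turns it into a lambda that across() applies column by column. Note also that the `na.rm = T` from the question is no longer passed in the code above. A minimal sketch of one way to keep it, assuming a recent dplyr (version 1.1.0 deprecated passing extra arguments through the `...` of across()), is to set it inside each anonymous function. The two-column selection, the name `res`, and the definition of `N` as the count of non-missing values are only illustrative choices, not part of the original code:

```r
library(dplyr)

df <- as_tibble(mtcars)

# Hypothetical reduced example: only two columns, to keep the output small.
res <- df %>%
  select(mpg, cyl) %>%
  summarise(across(
    .cols = everything(),
    .fns = list(
      # Each entry is a function of one column `x`; na.rm is set explicitly.
      Q25    = function(x) quantile(x, 0.25, na.rm = TRUE),
      Median = function(x) median(x, na.rm = TRUE),
      Q75    = function(x) quantile(x, 0.75, na.rm = TRUE),
      # Assumption: "N" here means the number of non-missing values.
      N      = function(x) sum(!is.na(x))
    ),
    .names = "{.col}_{.fn}"
  ))

print(res)
```

The result is a one-row tibble with columns such as `mpg_Q25` and `cyl_N`. The sketch uses `{.col}` and `{.fn}` in `.names`, which are the placeholder spellings given in the current dplyr documentation.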