Using summarise, across, and quantile functions together
I am trying to compute summary statistics using the mtcars dataset. Here is my code:
df <- as_tibble(mtcars)
df.sum2 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>%
  mutate(across(where(is.factor), as.numeric)) %>%
  summarise(across(
    .cols = everything(),
    .fns = list(
      Min = min,
      Q25 = quantile (., 0.25),
      Median = median,
      Q75 = quantile (., 0.75),
      Max = max,
      Mean = mean,
      StdDev = sd,
      N = n()
    ), na.rm = T,
    .names = "{col}_{fn}"
  )
  )
But I get the following error:
Error: Problem with `summarise()` input `..1`.
x Can't subset columns that don't exist.
x Locations 65, 66, 69, 71, 76, etc. don't exist.
i There are only 6 columns.
i Input `..1` is `across(...)`.
If I remove Q25 = quantile (., 0.25) and Q75 = quantile (., 0.75) from the code above, it works fine. In fact, I can get the expected result with the following code:
df.sum <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>% # select variables to summarise
  summarise_each(funs(Min = min,
                      Q25 = quantile (., 0.25),
                      Median = median,
                      Q75 = quantile (., 0.75),
                      Max = max,
                      Mean = mean,
                      StdDev = sd,
                      N = n()))
But I want to use the across function together with summarise. I do not want to use the summarise_each function.
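You need to use an anonymous function or the formula syntax when passing additional arguments. Written bare inside list(), quantile(., 0.25) and n() are calls that R evaluates immediately, not functions that across() can apply to each column. Prefixing them with ~ turns them into lambdas. Try: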
library(dplyr)
df <- as_tibble(mtcars)  # same data as in the question
df.sum2 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>%
  mutate(across(where(is.factor), as.numeric)) %>%
  summarise(across(
    .cols = everything(),
    .fns = list(
      Min = min,
      Q25 = ~quantile(., 0.25),   # formula lambda: . is the current column
      Median = median,
      Q75 = ~quantile(., 0.75),
      Max = max,
      Mean = mean,
      StdDev = sd,
      N = ~n()                    # n() takes no arguments, so it also needs a lambda
    ),
    .names = "{col}_{fn}"
  )
  )
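Note that the code above drops the na.rm = T from the question. If you still want missing values removed, one option is to write every entry as an explicit anonymous function, which makes it clear where each extra argument goes. The following is only a sketch under a few assumptions: the names summary_fns and df.sum3 are made up for illustration, names = FALSE is added so quantile() returns a plain number, and N counts non-missing values rather than calling n() (the two agree for mtcars, which has no missing values).

```r
library(dplyr)

df <- as_tibble(mtcars)

# Hypothetical helper list: every entry is a function of a single column x,
# so extra arguments such as na.rm are passed explicitly inside each one.
summary_fns <- list(
  Min    = function(x) min(x, na.rm = TRUE),
  Q25    = function(x) quantile(x, 0.25, na.rm = TRUE, names = FALSE),
  Median = function(x) median(x, na.rm = TRUE),
  Q75    = function(x) quantile(x, 0.75, na.rm = TRUE, names = FALSE),
  Max    = function(x) max(x, na.rm = TRUE),
  Mean   = function(x) mean(x, na.rm = TRUE),
  StdDev = function(x) sd(x, na.rm = TRUE),
  N      = function(x) sum(!is.na(x))  # assumption: count of non-missing values
)

df.sum3 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>%
  summarise(across(
    .cols = everything(),
    .fns = summary_fns,
    .names = "{col}_{fn}"
  ))
```

This should give a one-row tibble with one column per variable/statistic pair (mpg_Min, mpg_Q25, and so on). Keeping the functions in a named list also lets you reuse them in a grouped summarise.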