Function with dplyr and multiple statements
I'm having trouble using several dplyr functions inside a single function, even though I'm using the underscore variants of those functions.
Example
library(dplyr)
# Data:
mydf <- data.frame(
  var1 = factor(rep(1:24, each = 100)),
  var2 = runif(2400, min = -10, max = 125),
  var3 = runif(2400, min = 0, max = 2500),
  var4 = runif(2400, min = -10, max = 25)
)
# The function I want to build:
fx.average <- function(df, varlist) {
  # select some variables from a data frame
  df <- df %>% dplyr::select_(.dots = varlist)
  # Group by a variable and then just calculate the mean
  df <- df %>% dplyr::group_by_(var1) %>% # added df here
    dplyr::summarise_each_(funs_(mean(., na.rm = TRUE)))
}
So now I want to test the function:
# Test function, Setup var-list
varlist0 <- c("var1", "var2", "var3")
fx.average(mydf, varlist0)
# Error in dplyr::group_by_(var1) : object 'var1' not found
# object 'var1' not found
# Manual example
mydf %>% dplyr::select(var1, var2, var3) %>%
  group_by(var1) %>%
  summarise_each(funs(mean(., na.rm = TRUE)))
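I'm not sure what is going wrong. From other questions, it seems this should be solved by adding the underscore to the function names, since those variants are built for use inside functions?
The OP's code has a few mistakes: the data is not specified in the group_by step, NSE is used with an unquoted name where group_by_ needs a quoted string, and funs_ and summarise_each_ are used where summarise_each and funs work.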
fx.average <- function(df, varlist) {
  df %>%
    dplyr::select_(.dots = varlist) %>%
    dplyr::group_by_(.dots = "var1") %>%
    dplyr::summarise_each(funs(mean(., na.rm = TRUE)))
}
fx.average(mydf, varlist0)
# A tibble: 24 × 3
# var1 var2 var3
# <fctr> <dbl> <dbl>
#1 1 55.13601 1141.021
#2 2 59.16508 1155.226
#3 3 59.64524 1245.043
#4 4 60.12310 1284.808
#5 5 57.65874 1221.771
#6 6 58.86611 1266.026
#7 7 66.13987 1303.927
#8 8 54.21595 1303.638
#9 9 63.84230 1280.380
#10 10 49.15238 1236.456
# ... with 14 more rows
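In newer dplyr releases the underscore verbs, summarise_each() and funs() are deprecated, so the same idea could be sketched roughly as below. This assumes dplyr 1.0.0 or later. The name fx.average2 and the group_var argument are hypothetical, introduced only for illustration, and are not part of the answer above.

```r
library(dplyr)

# Hypothetical rewrite of fx.average, assuming dplyr >= 1.0.0.
# Column names are still passed as character vectors.
fx.average2 <- function(df, varlist, group_var = "var1") {
  df %>%
    select(all_of(varlist)) %>%               # pick columns by name (strings)
    group_by(across(all_of(group_var))) %>%   # group by a column given as a string
    summarise(
      across(everything(), ~ mean(.x, na.rm = TRUE)),  # across() skips the grouping column
      .groups = "drop"
    )
}

# Data with the same shape as in the question
mydf <- data.frame(
  var1 = factor(rep(1:24, each = 100)),
  var2 = runif(2400, min = -10, max = 125),
  var3 = runif(2400, min = 0, max = 2500),
  var4 = runif(2400, min = -10, max = 25)
)

res <- fx.average2(mydf, c("var1", "var2", "var3"))
```

Here all_of() takes the place of the .dots argument: it tells dplyr that the character vector holds column names, so no quoting tricks are needed inside the function.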