Applying a function to all data.frames in the environment
I want to apply the cleanfunction below to all data.frames in my environment.
cleanfunction <- function(dataframe) {
dataframe <- as.data.frame(dataframe)
## get mode of all vars
var_mode <- sapply(dataframe, mode)
## produce error if complex or raw is found
if (any(var_mode %in% c("complex", "raw"))) stop("complex or raw not allowed!")
## get class of all vars
var_class <- sapply(dataframe, class)
## produce error if an "AsIs" object has "logical" or "character" mode
if (any(var_mode[var_class == "AsIs"] %in% c("logical", "character"))) {
stop("matrix variables with 'AsIs' class must be 'numeric'")
}
## identify columns that needs be coerced to factors
ind1 <- which(var_mode %in% c("logical", "character"))
## coerce logical / character to factor with `as.factor`
dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
return(dataframe)
}
library(data.table)
set.seed(10238)
DT = data.table(
A = rep(1:3, each = 5L),
B = rep(1:5, 3L),
C = sample(15L),
D = sample(15L)
)
DT_II <- copy(DT)
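## keep only the names of data.frame-like objects (a bare ls() would also list cleanfunction)
dfs <- Filter(function(nm) is.data.frame(get(nm)), ls())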
Now I want to apply this function to all the data frames in the environment. I have tried about ten things, but I cannot get the syntax right:
for (i in seq_along(dfs)) {
get(dfs[i])[ , lapply(.SD, cleanfunction)]
}
Edit:
I found this, but it does not store the results:
eapply(globalenv(), function(x) if (is.data.frame(x)) cleanfunction(x))
How can I store the result back in each object?
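You call get(dfs[i]), which returns a reference to the data.table, but you are then lapply-ing over each column of that frame, whereas I infer from your function's argument name, dataframe, that it expects a whole frame. A start might be: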
for (i in seq_along(dfs)) {
get(dfs[i])[ , cleanfunction(.SD)]
}
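But realize that this operation returns a new frame; it does not use the canonical data.table mechanisms for updating the data in place. I suggest you update your function to always coerce to data.table and work on it by reference.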
cleanfunction <- function(dataframe) {
setDT(dataframe)
## get mode of all vars
var_mode <- sapply(dataframe, mode)
## produce error if complex or raw is found
if (any(var_mode %in% c("complex", "raw"))) stop("complex or raw not allowed!")
## get class of all vars
var_class <- sapply(dataframe, class)
## produce error if an "AsIs" object has "logical" or "character" mode
if (any(var_mode[var_class == "AsIs"] %in% c("logical", "character"))) {
stop("matrix variables with 'AsIs' class must be 'numeric'")
}
## identify columns that needs be coerced to factors
ind1 <- which(var_mode %in% c("logical", "character"))
## coerce logical / character to factor with `as.factor`
if (length(ind1)) dataframe[, c(ind1) := lapply(.SD, as.factor), .SDcols = ind1]
return(dataframe)
}
Since your current data does not trigger any changes, I will update one of the tables:
DT[,quux:="A"]
head(DT)
# A B C D quux
# <int> <int> <int> <int> <char>
# 1: 1 1 12 15 A
# 2: 1 2 4 6 A
# 3: 1 3 5 7 A
# 4: 1 4 9 1 A
# 5: 1 5 6 14 A
# 6: 2 1 15 13 A
for (i in seq_along(dfs)) cleanfunction(get(dfs[i]))
head(DT)
# A B C D quux
# <int> <int> <int> <int> <fctr>
# 1: 1 1 12 15 A
# 2: 1 2 4 6 A
# 3: 1 3 5 7 A
# 4: 1 4 9 1 A
# 5: 1 5 6 14 A
# 6: 2 1 15 13 A
Note that the for loop relies solely on the by-reference update; the return value of cleanfunction is ignored here.
Because of data.table's reference semantics, this approach works as is; if you were using data.frame or tbl_df, you would likely need to wrap the call to cleanfunction(.) in assign(dfs[i], cleanfunction(..)).
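For plain data.frames, the assign() variant mentioned above might look like the following minimal sketch. It uses base R only. clean_df is a simplified stand-in for cleanfunction (it keeps only the character/logical-to-factor step), and the environment env and the objects df1, df2 and n are made up for illustration.

```r
# Simplified stand-in for cleanfunction (assumption: only the factor coercion step)
clean_df <- function(dataframe) {
  dataframe <- as.data.frame(dataframe)
  var_mode <- sapply(dataframe, mode)
  ind1 <- which(var_mode %in% c("logical", "character"))
  dataframe[ind1] <- lapply(dataframe[ind1], as.factor)
  dataframe
}

# Hypothetical environment holding two data.frames and one other object
env <- new.env()
env$df1 <- data.frame(a = 1:3, b = c("x", "y", "z"), stringsAsFactors = FALSE)
env$df2 <- data.frame(u = c(TRUE, FALSE), v = c(1.5, 2.5))
env$n <- 42

# Keep only the names that refer to data.frames
dfs <- Filter(function(nm) is.data.frame(get(nm, envir = env)), ls(env))

# Calling the function without assigning leaves the original untouched
clean_df(env$df1)
class(env$df1$b)  # still "character"

# Writing the result back under the same name updates the object
for (nm in dfs) {
  assign(nm, clean_df(get(nm, envir = env)), envir = env)
}
class(env$df1$b)  # now "factor"
```

To work on the global environment instead, envir = globalenv() could be used in place of env; a separate environment is used here only to keep the sketch from touching other objects.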
Does this work for you?
# store all data.frames from the environment in a list
dfs <- Filter(function(x) is(x, "data.frame"), mget(ls()))
# then apply your function
lapply(dfs, cleanfunction)
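Note that lapply() returns a new list and leaves the original objects as they were. If the cleaned versions should replace the originals, one option is list2env(), sketched here with base R only. clean_df is a simplified stand-in for cleanfunction, and env, df1, df2 and note are hypothetical names used for illustration.

```r
# Simplified stand-in for cleanfunction (assumption: only the factor coercion step)
clean_df <- function(dataframe) {
  dataframe <- as.data.frame(dataframe)
  is_chr_lgl <- vapply(
    dataframe,
    function(col) is.character(col) || is.logical(col),
    logical(1)
  )
  cols <- names(dataframe)[is_chr_lgl]
  dataframe[cols] <- lapply(dataframe[cols], as.factor)
  dataframe
}

# Hypothetical environment with two data.frames and one other object
env <- new.env()
env$df1 <- data.frame(id = 1:2, grp = c("a", "b"), stringsAsFactors = FALSE)
env$df2 <- data.frame(flag = c(TRUE, FALSE, TRUE))
env$note <- "not a data.frame"

# Collect all data.frames in the environment into a named list
dfs <- Filter(is.data.frame, mget(ls(env), envir = env))

# Clean each one; lapply keeps the list names
cleaned <- lapply(dfs, clean_df)

# Copy the cleaned data.frames back under their original names
invisible(list2env(cleaned, envir = env))

str(env$df1)
```

Because the list built by mget() is named after the objects, list2env() can write each element back to the variable it came from; passing envir = globalenv() would do the same for the global environment.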